Abstract

The Goulden-Jackson cluster method is a powerful tool for obtaining generating 
functions for counting words in a free monoid by occurrences of a set of subwords. We 
introduce a generalization of the cluster method for monoid networks, which generalize 
the combinatorial framework of free monoids. As a sample application of the generalized 
cluster method, we compute bivariate and multivariate generating functions counting 
Motzkin paths (both with height bounded and unbounded) by statistics corresponding
to the number of occurrences of various subwords, yielding both closed-form and
continued fraction formulae. 


1. Introduction 

Given a finite or countably infinite set $A$, let $A^*$ be the set of all finite sequences of elements
of $A$, including the empty sequence. We call $A$ an alphabet, the elements of $A$ letters,
and the elements of $A^*$ words. By defining an associative binary operation on two words
by concatenating them, we see that $A^*$ is a monoid under the operation of concatenation
(where the empty word is the identity element), and we call $A^*$ the free monoid on $A$. The
length $l(\alpha)$ of a word $\alpha \in A^*$ is the number of letters in $\alpha$. For $\alpha, \beta \in A^*$, we say that $\beta$ is
a subword of $\alpha$ if $\alpha = \gamma_1 \beta \gamma_2$ for some $\gamma_1, \gamma_2 \in A^*$, and in this case we also say $\alpha$ contains $\beta$.

More generally, a free monoid is a monoid isomorphic to a free monoid on some alphabet.
The combinatorial framework of free monoids is useful for the study of combinatorial
objects that can be uniquely decomposed into sequences of “prime elements”, corresponding
to letters in an alphabet. This framework can furthermore be generalized using what are
called “monoid networks”, which were first introduced by Gessel [8, Chapter 6] in a slightly
different yet equivalent form called “G-systems”.¹ Roughly speaking, a monoid network
consists of a digraph $G$ with each arc assigned a set of letters from an alphabet $A$, in which
the set of sequences of arcs in $G$ is given a monoid structure and is equipped with a monoid
homomorphism.

¹ The term “G-system” was dropped at the request of Ira Gessel, who prefers the name “monoid network”
given by the author.

The Goulden-Jackson cluster method allows one to determine the generating function
for words in a free monoid $A^*$ by occurrences of words in a set $B \subseteq A^*$ as subwords in
terms of the generating function for what are called “clusters” formed by words in $B$, which
is easier to compute. As its name suggests, this celebrated result was first given by Goulden
and Jackson in [9]. The cluster method has seen a number of extensions and generalizations,
and the cluster method itself can be viewed as a generalization of
the Carlitz-Scoville-Vaughan theorem, which allows one to count words in a free monoid
avoiding a specified set of length 2 subwords.

In this paper, we give a new generalization of the Goulden-Jackson cluster method of a 
different flavor: we generalize the cluster method to monoid networks, which gives a way of 
counting words in A* corresponding to walks between two specified vertices in G (that is, 
words in a regular language if the alphabet A is finite) by occurrences of subwords in a set 
B. Then the original version of the cluster method corresponds to the special case in which 
G consists of a single vertex with a loop to which the entire alphabet A is assigned. 

The organization of this paper is as follows. In Section 2, we give an expository account of 
the original Goulden-Jackson cluster method. In Section 3, we introduce the combinatorial 
framework of monoid networks and present our generalization of the cluster method for 
monoid networks. Finally, in Section 4, we demonstrate how our monoid network version of 
the cluster method can be used to tackle problems in lattice path enumeration. 

Although many types of lattice paths can be represented as walks in certain digraphs, 
in this paper we focus on Motzkin paths, which are paths in $\mathbb{Z}$ beginning and ending at 0
and never going below 0, with steps −1, 0, and 1 (also called “down steps”, “flat steps”, and
“up steps”, respectively). We consider both regular Motzkin paths and Motzkin paths bounded
by height, and our results include bivariate and multivariate generating functions counting
these paths by ascents, plateaus, peaks, and valleys (all of which are statistics that are
determined by occurrences of various subwords in the underlying word of the Motzkin path),
as well as generating functions for Motzkin paths with restrictions on the heights at which
these subwords can occur, yielding both closed-form and continued fraction formulae.
Several interesting identities are
uncovered along the way. 

2. The Goulden-Jackson cluster method

We begin this section with a motivating problem: let A be a finite or countably infinite 
alphabet and suppose that we want to count words in A* that do not contain a specified set 
B of forbidden subwords of length at least 2. The Goulden-Jackson cluster method allows 
us to count this restricted set of words by counting “clusters” formed by words in B, which 
we shall define shortly. 

Given a word $\alpha = a_1 a_2 \cdots a_n \in A^*$ (where the $a_i$ are letters) and a set $B \subseteq A^*$, we say
that $(i, \beta)$ is a marked subword of $\alpha$ if $\beta \in B$ and

$$\beta = a_i a_{i+1} \cdots a_{i+l(\beta)-1};$$

that is, $\beta$ is a subword of $\alpha$ starting at position $i$. Moreover, we say that $(\alpha, S)$ is a marked
word (on $\alpha$) if $\alpha \in A^*$ and $S$ is any set of marked subwords of $\alpha$.

For example, suppose that $A = \{a, b, c\}$ and $B = \{abc, bca\}$. Then

$$(abcabbcabc, \{(1, abc), (2, bca), (6, bca)\}) \tag{1}$$

is a marked word, which can also be displayed by circling each marked subword within the
word $abcabbcabc$ itself.

The concatenation of two marked words is defined in the obvious way. For example, (1)
can be obtained by concatenating $(abca, \{(1, abc), (2, bca)\})$ and $(bbcabc, \{(2, bca)\})$.


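A marked word (on $\alpha$) is called a cluster (on $\alpha$) if it is not a concatenation of two
nonempty marked words. So, (1) is not a cluster, but

[Display: a marked word whose marked subwords overlap along its entire length]

is a cluster. Two additional examples of clusters, using $A = \{a\}$ and $B = \{aaaa\}$, are

[Display: a cluster on a power of $a$]

and

[Display: a second cluster on a power of $a$]
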
which we include to emphasize the fact that a cluster is not required to be “maximal” in 
the sense that every possible marked subword must be included. If a word a has only one 
possible cluster, then there is no need to indicate the positions of the marked subwords and 
we say (by abuse of language) that the only cluster on a is itself. 
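
The definition of a cluster can be made concrete with a short brute-force sketch. It assumes (as the text intends, since words in $B$ have length at least 2) that a cluster contains at least one marked subword, and it tests the "not a concatenation" condition by checking that every cut between adjacent letters is crossed by some marked subword. The function names are illustrative only.

```python
from itertools import combinations

def occurrences(word, B):
    """All marked subwords (i, beta) of word, with 1-indexed start i."""
    return [(i + 1, beta)
            for beta in sorted(B)
            for i in range(len(word) - len(beta) + 1)
            if word.startswith(beta, i)]

def is_cluster(word, marks):
    """A nonempty set of marks is a cluster if no cut between two adjacent
    letters is left uncrossed by every marked subword."""
    if not marks:
        return False
    for cut in range(1, len(word)):  # cut between positions cut and cut + 1
        if not any(i <= cut < i + len(beta) - 1 for i, beta in marks):
            return False
    return True

def clusters(word, B):
    """All clusters on word, each returned as a set of marked subwords."""
    occ = occurrences(word, B)
    return [set(S)
            for r in range(1, len(occ) + 1)
            for S in combinations(occ, r)
            if is_cluster(word, S)]
```

For instance, with $B = \{aaaa\}$ the word $a^7$ carries several clusters, because the middle occurrences of $aaaa$ may or may not be marked; this is the sense in which a cluster need not be maximal.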

Before formally presenting the cluster method, we introduce some additional notation.
For a word $\alpha \in A^*$, let $\mathrm{bad}(\alpha)$ be the number of occurrences in $\alpha$ of words in $B$ and let
$C_\alpha$ be the set of all clusters on the word $\alpha$. Given a cluster $c$, let $\mathrm{mk}(c)$ be the number of
marked subwords in $c$. Given an indeterminate $t$ that commutes with all of the letters in $A$,
define

$$F(t) = \sum_{\alpha \in A^*} \alpha \, t^{\mathrm{bad}(\alpha)}$$

and

$$L(t) = \sum_{\alpha \in A^*} \alpha \sum_{c \in C_\alpha} t^{\mathrm{mk}(c)},$$

so that $F(t)$ is the generating function for words in $A^*$ by the number of occurrences of words
in $B$, and $L(t)$ is the generating function for clusters by the number of marked subwords.
Both $F(t)$ and $L(t)$ are elements of the formal power series algebra $K\langle\langle A^* \rangle\rangle[[t]]$, where $K$
is a field of characteristic zero (which we can take to be $\mathbb{C}$) and $K\langle\langle A^* \rangle\rangle$, called the total
algebra of $A^*$ over $K$, is the algebra of formal sums of words in $A^*$ with coefficients in $K$.

Theorem 1 (Goulden-Jackson cluster method, version 1). Let $A$ be an alphabet and let
$B \subseteq A^*$ be a set of words of length at least 2. Then,

$$F(t) = \Big(1 - \sum_{a \in A} a - L(t-1)\Big)^{-1}.$$

Proof. We prove the equivalent statement

$$F(1+t) = \Big(1 - \sum_{a \in A} a - L(t)\Big)^{-1}.$$

We have

$$F(1+t) = \sum_{\alpha \in A^*} \alpha \, (1+t)^{\mathrm{bad}(\alpha)} = \sum_{\alpha \in A^*} \alpha \sum_{S \subseteq B_\alpha} t^{|S|}, \tag{2}$$

where $B_\alpha$ is the set of occurrences of words in $B$ in $\alpha$. Note that (2) counts marked words,
each weighted by the number of marked subwords that it contains, and from here it is easy to see
that

$$F(1+t) = \sum_{\alpha \in A^*} \alpha \sum_{S \subseteq B_\alpha} t^{|S|} = \Big(1 - \sum_{a \in A} a - L(t)\Big)^{-1}$$

since every marked word is uniquely built from a sequence of letters in $A$ and clusters. □

We indicate three specializations of Theorem 1 that are of particular importance:


• By setting $t = 0$, we obtain

$$\Big(1 - \sum_{a \in A} a - L(-1)\Big)^{-1}$$

as the generating function for words in $A^*$ that do not contain any words in $B$, which
solves the problem posed at the beginning of this section, assuming that we can
compute the cluster generating function $L(t)$.

• If every word in $B$ has length exactly 2, then setting $t = 0$ yields the Carlitz-Scoville-
Vaughan theorem, which was independently discovered by Fröberg [7, Section 4],
Carlitz et al. [2, Theorem 7.3], and Gessel [8, Theorem 4.1]. In fact, Chapters 4 and 5 of
Gessel’s doctoral thesis [8] are devoted to the Carlitz-Scoville-Vaughan theorem and
its many enumerative applications.

• By setting $t = 1$, we obtain the free monoid identity

$$\sum_{\alpha \in A^*} \alpha = \Big(1 - \sum_{a \in A} a\Big)^{-1}. \tag{3}$$



More generally, we can assign each word in $B$ its own indeterminate. Write $B = \{\beta_1, \beta_2, \dots\}$
so that the words in $B$ are ordered. (Here, $B$ is presented as countably infinite although in
most applications it is finite.) Given a word $\alpha \in A^*$, let $\mathrm{bad}_k(\alpha)$ be the number of occurrences
of $\beta_k$ in $\alpha$, and given a cluster $c$, let $\mathrm{mk}_k(c)$ be the number of marked subwords in $c$ of the
form $(i, \beta_k)$. Let $t_1, t_2, \dots$ be indeterminates that commute with each other and with the
letters of $A$, and define the generating functions

$$F(t_1, t_2, \dots) = \sum_{\alpha \in A^*} \alpha \prod_{k=1}^{\infty} t_k^{\mathrm{bad}_k(\alpha)}$$

and

$$L(t_1, t_2, \dots) = \sum_{\alpha \in A^*} \alpha \sum_{c \in C_\alpha} \prod_{k=1}^{\infty} t_k^{\mathrm{mk}_k(c)}.$$

Then we have a refinement of Theorem 1 which follows by the same reasoning as before:

Theorem 2 (Goulden-Jackson cluster method, version 2). Let $A$ be an alphabet and let
$B = \{\beta_1, \beta_2, \dots\} \subseteq A^*$ be a set of words of length at least 2. Then,

$$F(t_1, t_2, \dots) = \Big(1 - \sum_{a \in A} a - L(t_1 - 1, t_2 - 1, \dots)\Big)^{-1}.$$

The statement of Theorem 2 uses an infinite set $B$ and infinitely many indeterminates $t_k$,
but it is clear that the finite case works as well. The number of indeterminates also does not
need to equal the number of words in $B$; for example, we can have $B = \{\beta_1, \dots, \beta_k\}$ along
with two indeterminates $t_1$ and $t_2$, and attach $t_1$ to all $\beta_i$ with $i$ odd and attach $t_2$ to all $\beta_i$
with $i$ even.

As an example, let $A = \{a, b, c\}$ and suppose that we want to count words in $A^*$ by
occurrences of $\beta_1 = acb$ and $\beta_2 = bc$. Then the only clusters are $acb$, $bc$, and $acbc$, so

$$L(t_1, t_2) = acb\,t_1 + bc\,t_2 + acbc\,t_1 t_2$$

and by Theorem 2, we obtain

$$F(t_1, t_2) = \big(1 - a - b - c - acb(t_1 - 1) - bc(t_2 - 1) - acbc(t_1 - 1)(t_2 - 1)\big)^{-1} \tag{4}$$

as the generating function for words in $A^*$ by occurrences of $acb$ and $bc$. By setting
$t_1 = t_2 = 0$, we obtain

$$(1 - a - b - c + acb + bc - acbc)^{-1} \tag{5}$$

as the generating function for words in $A^*$ which contain neither $acb$ nor $bc$.

Now, let $x$ be an indeterminate that commutes with $t_1$ and $t_2$. If we apply the homomorphism
sending each of the letters to $x$, we obtain the generating functions

$$\frac{1}{1 - 3x - x^2(t_2 - 1) - x^3(t_1 - 1) - x^4(t_1 - 1)(t_2 - 1)}$$

and

$$\frac{1}{1 - 3x + x^2 + x^3 - x^4}$$

from (4) and (5), respectively, which keep track of these words by length.
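
As a sanity check on the second generating function, the following sketch compares its power series coefficients with a brute-force count of words on $\{a, b, c\}$ that contain neither $acb$ nor $bc$. The helper names are illustrative; the series is expanded by the usual linear recurrence read off from the denominator.

```python
from itertools import product

def count_avoiding(n, alphabet="abc", forbidden=("acb", "bc")):
    """Brute-force count of length-n words containing no forbidden subword."""
    return sum(
        1
        for letters in product(alphabet, repeat=n)
        if not any(f in "".join(letters) for f in forbidden)
    )

def series_of_reciprocal(denominator, terms):
    """First `terms` coefficients of 1 / denominator(x), where denominator
    is a list of integer coefficients with constant term 1."""
    coeffs = []
    for n in range(terms):
        total = 1 if n == 0 else 0
        for k in range(1, min(n, len(denominator) - 1) + 1):
            total -= denominator[k] * coeffs[n - k]
        coeffs.append(total)
    return coeffs

# Coefficients of 1 - 3x + x^2 + x^3 - x^4, lowest degree first.
DENOMINATOR = [1, -3, 1, 1, -1]

if __name__ == "__main__":
    print(series_of_reciprocal(DENOMINATOR, 8))
    print([count_avoiding(n) for n in range(8)])
```

Both printed lists should agree term by term; the brute-force side is exponential in the length, so it is only meant for small checks.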

We say that the set $B$ is reduced if no word $\beta \in B$ is a subword of another word $\beta'$ in $B$.
Although the cluster method as presented above works regardless of whether $B$ is reduced,
Goulden and Jackson gave a formula in their original paper [9] for the cluster generating
function when $A$ and $B$ are finite sets with $B$ reduced. A set $B$ of forbidden subwords
can always be replaced by a reduced set and still yield the same restricted set of words; if
$\beta \in B$ is a subword of $\beta' \in B$, then we can remove $\beta'$ from $B$ because containing $\beta'$ implies
containing $\beta$. However, the criterion of having a reduced set can be an issue if we want to
count words by occurrences of subwords (that is, without setting $t = 0$). For instance, we
would not be able to use Goulden and Jackson’s formula to compute the cluster generating
function given $B = \{aba, abab\}$ since $aba$ is a subword of $abab$.

As part of [3J, Noonan and Zeilberger wrote a Maple package that handles the case 
where B is arbitrary (i.e., not necessarily reduced), but without a detailed explanation of 
their algorithms. Bassino et al. pj later gave an explicit expression for the cluster generating 
function in the non-reduced case. We omit these formulae of Goulden and Jackson and 
Bassino et al. because the cluster generating functions in Section 4 of this paper will require 
essentially no computation. 

3. Our generalization of the cluster method 

3.1. Monoid networks 

Throughout this section, fix a field $K$ of characteristic zero and let $A$ be a finite or countably
infinite alphabet. As in the previous section, $K\langle\langle A^* \rangle\rangle$ is the total algebra of $A^*$ over $K$. We
also let $\mathrm{Mat}_m(K\langle\langle A^* \rangle\rangle)$ denote the algebra of $m \times m$ matrices with entries in $K\langle\langle A^* \rangle\rangle$.

Let $G$ be a digraph on the vertex set $[m]$ such that each arc $(i,j)$ of $G$ is assigned a set
$P_{i,j}$ of letters in $A$, and let $P$ be the set of all pairs $(a, e)$ where $e = (i,j)$ is an arc of $G$
and $a \in P_{i,j}$. Define $\vec{P}^* \subseteq P^*$ to be the subset of all sequences $\alpha = (a_1, e_1)(a_2, e_2) \cdots (a_n, e_n)$
where $e_1 e_2 \cdots e_n$ is a walk in $G$. Given $\alpha = (a_1, e_1)(a_2, e_2) \cdots (a_n, e_n)$ in $\vec{P}^*$, we define
$\rho(\alpha) = a_1 a_2 \cdots a_n$ to be the word obtained by projecting onto $A^*$ and let $E(\alpha) = (i,j)$ where
$i$ and $j$ are the initial and terminal vertices, respectively, of the walk $e_1 e_2 \cdots e_n$.

For example, consider the following:

[Figure: a digraph on the vertices 1 and 2, with a loop at vertex 1 labeled $b$, an arc from 1 to 2
labeled $\{a, c\}$, and an arc from 2 to 1 labeled $\{b, c\}$.]

Here $P = \{(b, (1,1)), (a, (1,2)), (c, (1,2)), (b, (2,1)), (c, (2,1))\}$. One element of $\vec{P}^*$ is
$\alpha = (b, (2,1))(b, (1,1))(a, (1,2))$, and so $\rho(\alpha) = bba$ and $E(\alpha) = (2,2)$.

We say that $(G, P)$ is a monoid network on $A^*$ if for all nonempty $\alpha, \beta \in \vec{P}^*$, if $\rho(\alpha) = \rho(\beta)$
and $E(\alpha) = E(\beta)$ then $\alpha = \beta$. That is, the same word in $A^*$ cannot be obtained by traversing
two different walks with the same initial and terminal vertices. It is easy to see that $(G, P)$
in the example given above is a monoid network.

We can very naturally represent words in $P^*$ using matrices. For each element
$p = (a, (i,j)) \in P$, we associate $p$ with the $m \times m$ matrix $M_p$ with $a$ in the $(i,j)$ entry and 0
everywhere else, which defines a monoid homomorphism $\Lambda : P^* \to \mathrm{Mat}_m(K\langle\langle A^* \rangle\rangle)$, where we
consider the codomain as the multiplicative monoid of the algebra $\mathrm{Mat}_m(K\langle\langle A^* \rangle\rangle)$. Applying
$\Lambda$ to the empty word 1 gives the $m \times m$ identity matrix $I_m$.

If $\alpha \in \vec{P}^*$ and $E(\alpha) = (i,j)$, then $\Lambda(\alpha)$ is the $m \times m$ matrix with $\rho(\alpha)$ in the $(i,j)$ entry
and 0 everywhere else; we denote this matrix $M_\alpha$. If $\alpha \notin \vec{P}^*$, then $M_\alpha = \Lambda(\alpha) = 0_m$, the
$m \times m$ zero matrix.

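Returning to the example above, the matrices $M_p$ are

$$\begin{bmatrix} b & 0 \\ 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & a \\ 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & c \\ 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 \\ b & 0 \end{bmatrix}, \quad \text{and} \quad
\begin{bmatrix} 0 & 0 \\ c & 0 \end{bmatrix},$$

and for $\alpha = (b, (2,1))(b, (1,1))(a, (1,2))$, we have

$$\Lambda(\alpha) = \begin{bmatrix} 0 & 0 \\ 0 & bba \end{bmatrix}.$$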
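
The matrix product behind this example can be traced with a small sketch. It represents a zero entry by `None` and a word by a Python string, which is enough here because every factor has a single nonzero entry; the function names and this representation are assumptions made for illustration, not part of the paper.

```python
def single_entry(letter, arc, m=2):
    """Matrix M_p for p = (letter, arc): `letter` in entry arc, None elsewhere.
    None plays the role of 0; vertices are 1-indexed as in the text."""
    i, j = arc
    M = [[None] * m for _ in range(m)]
    M[i - 1][j - 1] = letter
    return M

def multiply(X, Y):
    """Product of matrices whose entries are None (zero) or a single word.
    Sufficient here because Y always has at most one nonzero entry."""
    m = len(X)
    Z = [[None] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            for k in range(m):
                if X[i][k] is not None and Y[k][j] is not None:
                    Z[i][j] = X[i][k] + Y[k][j]  # concatenation of words
    return Z

def image_under_lambda(alpha, m=2):
    """Image of a sequence of (letter, arc) pairs under the homomorphism.
    The empty sequence maps to the identity (empty word on the diagonal)."""
    result = [["" if i == j else None for j in range(m)] for i in range(m)]
    for letter, arc in alpha:
        result = multiply(result, single_entry(letter, arc, m))
    return result

if __name__ == "__main__":
    alpha = [("b", (2, 1)), ("b", (1, 1)), ("a", (1, 2))]
    print(image_under_lambda(alpha))
```

A sequence of arcs that is not a walk, such as $(a, (1,2))(a, (1,2))$, is sent to the all-`None` matrix, matching the convention $M_\alpha = 0_m$.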


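We then extend $\Lambda$ by linearity to an algebra homomorphism $K\langle\langle P^* \rangle\rangle \to \mathrm{Mat}_m(K\langle\langle A^* \rangle\rangle)$,
which we also call $\Lambda$ by a slight abuse of notation. Given a monoid network $(G, P)$ and a
subset $S \subseteq A^*$, let $F_G(S) \in \mathrm{Mat}_m(K\langle\langle A^* \rangle\rangle)$ be the matrix whose $(i,j)$ entry is the generating
function for words in $S$ that can be obtained by traversing a walk from $i$ to $j$ in $G$. It is
clear that

$$F_G(S) = \sum_{\alpha \in V} M_\alpha$$

where $V$ is the set of all words $\alpha$ in $P^*$ that trace a walk in $G$ and whose projection onto $A^*$ lies in $S$.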

If the alphabet A is finite, then the idea of monoid networks may seem too similar to finite- 
state automata to warrant its own definition, but our approach is novel and is based on the 
monoid structure of $P^*$ and the power of the homomorphism $\Lambda$. Moreover, our construction
generalizes the combinatorial framework of free monoids, hence the name “monoid network”. 
For example, the following is an elementary result traditionally proven using the transfer- 
matrix method (see [HI Section 4.7] or [6j Section V.6]), but we can give a very simple proof 
using the homomorphism $\Lambda$:

Theorem 3. Suppose that $(G, P)$ is a monoid network on $A^*$. Then

$$F_G(A^*) = \Big(I_m - \sum_{p \in P} M_p\Big)^{-1}.$$

Proof. Take

$$\sum_{\alpha \in P^*} \alpha = \Big(1 - \sum_{p \in P} p\Big)^{-1},$$

which is (3) applied to the free monoid $P^*$, and then apply $\Lambda$ to both sides of the equation. □


Our proof of the generalized Goulden-Jackson cluster method presented later in this 
section is of a similar flavor. 

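Continuing with the example above, we have

$$F_G(A^*) = \begin{bmatrix} 1 - b & -a - c \\ -b - c & 1 \end{bmatrix}^{-1}$$

by Theorem 3. If we want the generating function for words by length that can be obtained
by traversing a walk from 1 to 2 in $(G, P)$, then we apply to $F_G(A^*)$ the homomorphism
sending each of the letters to $x$ to obtain the matrix

$$\begin{bmatrix} 1 - x & -2x \\ -2x & 1 \end{bmatrix}^{-1}
= \begin{bmatrix} \dfrac{1}{1 - x - 4x^2} & \dfrac{2x}{1 - x - 4x^2} \\[2ex] \dfrac{2x}{1 - x - 4x^2} & \dfrac{1 - x}{1 - x - 4x^2} \end{bmatrix}$$

and then take the $(1, 2)$ entry.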
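
The $(1,2)$ entry $2x/(1 - x - 4x^2)$ can be checked numerically. The sketch below counts words by length via powers of the matrix of letter counts (each arc contributes as many walks as it has letters, and the monoid network property guarantees distinct walks give distinct words) and compares them with the series coefficients. The names used are illustrative.

```python
def mat_mult(X, Y):
    """Product of two square integer matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Entry (i, j) is the number of letters assigned to the arc from i+1 to j+1
# in the example network: loop {b} at 1, {a, c} from 1 to 2, {b, c} from 2 to 1.
LETTER_COUNTS = [[1, 2],
                 [2, 0]]

def walks_1_to_2(terms):
    """Number of length-n words read along walks from vertex 1 to vertex 2."""
    counts, power = [], [[1, 0], [0, 1]]
    for _ in range(terms):
        counts.append(power[0][1])
        power = mat_mult(power, LETTER_COUNTS)
    return counts

def series_2x_over_denominator(terms):
    """Coefficients of 2x / (1 - x - 4x^2) via c_n = c_{n-1} + 4 c_{n-2}."""
    coeffs = [0, 2]
    while len(coeffs) < terms:
        coeffs.append(coeffs[-1] + 4 * coeffs[-2])
    return coeffs[:terms]

if __name__ == "__main__":
    print(walks_1_to_2(8))
    print(series_2x_over_denominator(8))
```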

We give one last remark before presenting our generalization of the cluster method. In 
the paper [20], the present author generalizes a theorem by Gessel which enables one to 
count words and permutations with restrictions on the lengths of their increasing runs. This 
generalized run theorem allows for a much wider variety of restrictions on increasing run 
lengths; specifically, these restrictions are those which can be encoded by a special type 
of digraph called a “run network”. Run networks are in fact monoid networks where the 
alphabet $A$ is $\mathbb{P}$ (the positive integers), but the language of monoid networks was not used
in the exposition of [20] because the proof of the generalized run theorem presented in that
paper did not make use of the monoid structure of $P^*$ or the homomorphism $\Lambda$. However, it
is possible to prove the generalized run theorem, albeit in a more complicated way, using
a monoid network version of the Carlitz-Scoville-Vaughan theorem.


3.2. The Goulden-Jackson cluster method for monoid networks

To motivate our generalization of the Goulden-Jackson cluster method, let us combine two
previous examples and suppose that we want to count words on the alphabet $A = \{a, b, c\}$
that satisfy two conditions. First, these words cannot contain any occurrences of $\beta_1 = acb$
and $\beta_2 = bc$, and second, these words must be obtainable by traversing a walk from vertex
1 to vertex 2 in the following monoid network $(G, P)$:

[Figure: a digraph on the vertices 1 and 2, with a loop at vertex 1 labeled $b$, an arc from 1 to 2
labeled $\{a, c\}$, and an arc from 2 to 1 labeled $\{b, c\}$.]

We can do this using our monoid network version of the Goulden-Jackson cluster method,
which we now present in full generality. Let $A$ be an alphabet and let $B = \{\beta_1, \beta_2, \dots\} \subseteq A^*$
be a set of words. Moreover, let $(G, P)$ be a monoid network with $m$ vertices, and for each
$k$, let $\vec{B}_k$ be the set of all words $\alpha$ in $\vec{P}^*$ with $\rho(\alpha) = \beta_k$, and let $\vec{B} = \bigcup_{k=1}^{\infty} \vec{B}_k$.

Define $F_G(t_1, t_2, \dots)$ to be the $m \times m$ matrix whose $(i,j)$ entry is the sum

$$\sum_{\alpha} \rho(\alpha) \prod_{k=1}^{\infty} t_k^{\mathrm{bad}_k(\rho(\alpha))}$$

over all $\alpha \in \vec{P}^*$ with $E(\alpha) = (i,j)$, which is the same as the sum

$$\sum_{\alpha} \alpha \prod_{k=1}^{\infty} t_k^{\mathrm{bad}_k(\alpha)}$$

over all $\alpha \in A^*$ that can be obtained by traversing a walk from vertex $i$ to vertex $j$ in the
monoid network $(G, P)$. Furthermore, define

$$L_G(t_1, t_2, \dots) = \sum_{\alpha \in \vec{P}^*} M_\alpha \sum_{c \in C_\alpha} \prod_{k=1}^{\infty} t_k^{\mathrm{mk}_k(c)}$$

where $C_\alpha$ is the set of all clusters (formed by words in $\vec{B}$) on the word $\alpha$, and $\mathrm{mk}_k(c)$
is the number of marked subwords in $c$ of the form $(i, \gamma)$ with $\gamma \in \vec{B}_k$. We will refer to
$L_G(t_1, t_2, \dots)$ as the cluster matrix.

Theorem 4 (Goulden-Jackson cluster method for monoid networks). Let $A$ be an alphabet
and let $B = \{\beta_1, \beta_2, \dots\} \subseteq A^*$ be a set of words of length at least 2. Also, let $G$ be a digraph
on $[m]$ and let $(G, P)$ be a monoid network on $A^*$. Then,

$$F_G(t_1, t_2, \dots) = \Big(I_m - \sum_{p \in P} M_p - L_G(t_1 - 1, t_2 - 1, \dots)\Big)^{-1}.$$

Proof. First, apply the original Goulden-Jackson cluster method (Theorem 2) to the alphabet
$P$ and the set of all walk-labeled words in $P^*$ that project onto words in $B$, attaching the
indeterminate $t_k$ to each such word that projects onto $\beta_k$. Then
applying the homomorphism $\Lambda$ yields the desired result. □


As before, the set of words in $B$ need not be infinite, and the number of indeterminates
can be less than the number of words in $B$. It is also possible to alter the cluster matrix to
only include clusters occurring at specified positions in the monoid network, which we do
in Section 4 to count Motzkin paths with no occurrences of subwords at specified heights.

We mention three specializations which are completely analogous to those given after
Theorem 1:

• By setting each indeterminate equal to 0, we obtain

$$\Big(I_m - \sum_{p \in P} M_p - L_G(-1, -1, \dots)\Big)^{-1}$$

as the $m \times m$ matrix whose $(i,j)$ entry is the sum $\sum \alpha$ over $\alpha \in A^*$ that can be
obtained by traversing a walk from vertex $i$ to vertex $j$ in the monoid network $(G, P)$
and contain no occurrences of words in $B$.

• If every word in $B$ has length exactly 2, then setting each indeterminate equal to 0
yields a monoid network version of the Carlitz-Scoville-Vaughan theorem.

• Setting each indeterminate equal to 1 gives an alternative proof for Theorem 3.

Observe that the original Goulden-Jackson cluster method corresponds to the special case
in which the monoid network consists of a single vertex with a loop to which the entire
alphabet $A$ is assigned. Thus Theorem 4 can accurately be characterized as a generalization
of the Goulden-Jackson cluster method.

Finally, we note that if the alphabet $A$ is finite, then a monoid network gives the transition
diagram of a nondeterministic finite automaton. Nondeterministic finite automata are
equivalent to deterministic finite automata, and the transition diagram of a deterministic
finite automaton is a monoid network. Therefore, Theorem 4 can be used to count words
in a regular language by occurrences of a specified set of subwords. (For relevant definitions
regarding formal languages and automata, see any introductory text on the theory of
computation.)

Let us now complete the example from earlier. We have

$$L_G(t_1, t_2)
= \begin{bmatrix} acb & 0 \\ 0 & 0 \end{bmatrix} t_1
+ \begin{bmatrix} 0 & 0 \\ 0 & bc \end{bmatrix} t_2
+ \begin{bmatrix} 0 & bc \\ 0 & 0 \end{bmatrix} t_2
+ \begin{bmatrix} 0 & acbc \\ 0 & 0 \end{bmatrix} t_1 t_2
= \begin{bmatrix} acb\,t_1 & bc\,t_2 + acbc\,t_1 t_2 \\ 0 & bc\,t_2 \end{bmatrix};$$

indeed, recall that the only three clusters formed by the words $acb$ and $bc$ are $acb$, $bc$, and
$acbc$, which can be obtained in the given monoid network by traversing walks with initial
and terminal vertices indicated in the matrices above. Thus,

$$\begin{aligned}
F_G(t_1, t_2) &= \Big(I_2 - \sum_{p \in P} M_p - L_G(t_1 - 1, t_2 - 1)\Big)^{-1} \\
&= \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
- \begin{bmatrix} b & a + c \\ b + c & 0 \end{bmatrix}
- \begin{bmatrix} acb(t_1 - 1) & bc(t_2 - 1) + acbc(t_1 - 1)(t_2 - 1) \\ 0 & bc(t_2 - 1) \end{bmatrix} \right)^{-1} \\
&= \begin{bmatrix} 1 - b - acb(t_1 - 1) & -a - c - bc(t_2 - 1) - acbc(t_1 - 1)(t_2 - 1) \\ -b - c & 1 - bc(t_2 - 1) \end{bmatrix}^{-1}.
\end{aligned}$$

Now we apply the homomorphism sending each of the letters to $x$, yielding the matrix

$$\begin{bmatrix} 1 - x - x^3(t_1 - 1) & -2x - x^2(t_2 - 1) - x^4(t_1 - 1)(t_2 - 1) \\ -2x & 1 - x^2(t_2 - 1) \end{bmatrix}^{-1}$$

whose $(1, 2)$ entry is

$$\frac{2x - (1 - t_2)x^2 + (1 - t_1 - t_2 + t_1 t_2)x^4}{1 - x - (3 + t_2)x^2 + (2 - t_1 - t_2)x^3 - (1 - t_1 - t_2 + t_1 t_2)x^5},$$

which is the generating function for words obtained by traversing a walk from vertex 1 to
vertex 2 in the given monoid network, weighted by length, occurrences of $acb$, and occurrences
of $bc$. Setting $t_1 = t_2 = 0$ gives the generating function

$$\frac{2x - x^2 + x^4}{1 - x - 3x^2 + 2x^3 - x^5}$$

for those words that do not contain any occurrences of $acb$ or $bc$.
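
The last generating function can be tested against direct enumeration. The sketch below reads words letter by letter through the example network (which happens to be deterministic, so a dictionary of transitions suffices), keeps those that end at vertex 2 and avoid $acb$ and $bc$, and compares the counts with the series coefficients. All identifiers are illustrative.

```python
from itertools import product

# ARCS[(vertex, letter)] is the vertex reached from `vertex` along an arc
# carrying `letter` in the example network.
ARCS = {(1, "b"): 1, (1, "a"): 2, (1, "c"): 2, (2, "b"): 1, (2, "c"): 1}

def end_vertex(word, start=1):
    """Vertex reached by reading `word` from `start`, or None if impossible."""
    vertex = start
    for letter in word:
        vertex = ARCS.get((vertex, letter))
        if vertex is None:
            return None
    return vertex

def brute_force(n):
    """Length-n words on walks from 1 to 2 that avoid acb and bc."""
    words = ("".join(w) for w in product("abc", repeat=n))
    return sum(1 for w in words
               if end_vertex(w) == 2 and "acb" not in w and "bc" not in w)

def series(numerator, denominator, terms):
    """Coefficients of numerator(x) / denominator(x), where the denominator
    has constant term 1; both are coefficient lists, lowest degree first."""
    coeffs = []
    for n in range(terms):
        total = numerator[n] if n < len(numerator) else 0
        for k in range(1, min(n, len(denominator) - 1) + 1):
            total -= denominator[k] * coeffs[n - k]
        coeffs.append(total)
    return coeffs

NUMERATOR = [0, 2, -1, 0, 1]          # 2x - x^2 + x^4
DENOMINATOR = [1, -1, -3, 2, 0, -1]   # 1 - x - 3x^2 + 2x^3 - x^5

if __name__ == "__main__":
    print(series(NUMERATOR, DENOMINATOR, 9))
    print([brute_force(n) for n in range(9)])
```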

We also state a weighted version of Theorem 4. Let $\{w_a^{(i,j)} : (a, (i,j)) \in P\}$ be a set
of weights that commute with each other, the indeterminates $t_1, t_2, \dots$, and the letters in
$A$. Set $w_a^{(i,j)} = 0$ if $(a, (i,j)) \notin P$. Given $\alpha = a_1 a_2 \cdots a_k \in A^*$ and $1 \le i, j \le m$, let
$w^{(i,j)}(\alpha) = w_{a_1}^{e_1} \cdots w_{a_k}^{e_k}$ if there exists $\beta = (a_1, e_1) \cdots (a_k, e_k) \in \vec{P}^*$ such that $E(\beta) = (i,j)$ and
$\rho(\beta) = \alpha$.

Define the map $\Lambda_w : P^* \to \mathrm{Mat}_m(K\langle\langle A^* \rangle\rangle)$ by sending $p = (a, (i,j))$ to the matrix $M_p$
with $w_a^{(i,j)} a$ in the $(i,j)$ entry and 0 everywhere else. If $\alpha = (a_1, e_1) \cdots (a_n, e_n) \in \vec{P}^*$ and
$E(\alpha) = (i,j)$, then $\Lambda_w(\alpha)$, which we also denote $M_\alpha$, has $w_{a_1}^{e_1} \cdots w_{a_n}^{e_n} \rho(\alpha)$ in the $(i,j)$ entry
and 0 everywhere else, and if $\alpha \notin \vec{P}^*$ then $M_\alpha = 0_m$. Again, $\Lambda_w$ extends to a homomorphism
$K\langle\langle P^* \rangle\rangle \to \mathrm{Mat}_m(K\langle\langle A^* \rangle\rangle)$, which we also call $\Lambda_w$. Note that setting all of the weights equal
to 1 gives $\Lambda_w = \Lambda$.

Theorem 5 (Goulden-Jackson cluster method for monoid networks, weighted version). Let
$A$ be an alphabet and let $B = \{\beta_1, \beta_2, \dots\} \subseteq A^*$ be a set of words of length at least 2; let
$G$ be a digraph on $[m]$ and let $(G, P)$ be a monoid network on $A^*$; let $F_G(t_1, t_2, \dots)$ be the
$m \times m$ matrix whose $(i,j)$ entry is the sum

$$\sum_{\alpha} w^{(i,j)}(\rho(\alpha)) \, \rho(\alpha) \prod_{k=1}^{\infty} t_k^{\mathrm{bad}_k(\rho(\alpha))}$$

over all $\alpha \in \vec{P}^*$ with $E(\alpha) = (i,j)$; and let

$$L_G(t_1, t_2, \dots) = \sum_{\alpha \in \vec{P}^*} M_\alpha \sum_{c \in C_\alpha} \prod_{k=1}^{\infty} t_k^{\mathrm{mk}_k(c)}.$$

Then,

$$F_G(t_1, t_2, \dots) = \Big(I_m - \sum_{p \in P} M_p - L_G(t_1 - 1, t_2 - 1, \dots)\Big)^{-1}.$$

The proof is the same as that of Theorem 4 except that we apply $\Lambda_w$ instead of $\Lambda$.

Although we will not use the weighted version of our main theorem in subsequent sections,
we note that it can be used with the monoid network framework to examine time-
homogeneous Markov chains, which are probabilistic analogues of finite-state automata.
Specifically, let $(G, P)$ be a monoid network with $m$ vertices, and for every $a \in A$ and
$i, j \in [m]$, let $w_a^{(i,j)} \in [0, 1]$ such that $w_a^{(i,j)} = 0$ if $(a, (i,j)) \notin P$ and

$$\sum_{j=1}^{m} \sum_{a \in A} w_a^{(i,j)} = 1$$

for each fixed $1 \le i \le m$. With a choice of initial vertex and terminal vertex, we can think of
this monoid network as a random word model, where a word is given by traversing a random
walk in $G$ from the initial vertex to the terminal vertex with $w_a^{(i,j)}$ being the probability that
at vertex $i$, the next letter in the word will be $a$ and the next arc $(i, j)$. Using Theorem 5 we
can then compute probabilities associated with this random process, such as the probability
that a length $n$ word obtained from traversing a walk between two specified vertices avoids
a specified set of forbidden subwords.


4. An application to lattice path enumeration 

4.1. Representing lattice paths using monoid networks 

A path on $\mathbb{Z}^k$ with steps in $S \subseteq \mathbb{Z}^k$ is an ordered tuple $(a_0, a_1, a_2, \dots, a_n)$ of values in $\mathbb{Z}^k$ such
that $a_{i+1} - a_i \in S$ for every $0 \le i < n$. Equivalently, it is an ordered tuple $(s_1, s_2, \dots, s_n)$
of values in $S$. Each step $s \in S$ is assigned a length in $\mathbb{Z}$ (which we take to be 1 unless
otherwise noted) and the length of a path is the sum of the lengths of all of its steps $s_i$.

These paths are collectively known as lattice paths. In particular, lattice paths on $\mathbb{Z}$
have been widely studied in the literature, usually with the conditions $a_0 = a_n = 0$ and
$a_i \ge 0$ for every $i$. Examples of these paths include Dyck paths, which have steps in $\{-1, 1\}$;
Motzkin paths, which have steps in $\{-1, 0, 1\}$; and Schröder paths, which are Motzkin paths
but with “0” steps having length 2 instead of 1. These paths are often illustrated as paths
in the plane starting at the origin, ending on the $x$-axis, and never going below the $x$-axis,
with up steps $(1, 1)$ corresponding to 1, down steps $(1, -1)$ corresponding to $-1$, and in the
case of Motzkin or Schröder paths, flat steps $(1, 0)$ or $(2, 0)$, respectively, corresponding to
0.

We say that a lattice path on $\mathbb{Z}$ has height bounded by $m$ if we add the condition that
$a_i \le m$ for every $i$. Lattice paths with bounded heights correspond to walks in certain
monoid networks. For example, a Dyck path with height bounded by $m$ is a walk from
vertex 0 to itself in the following monoid network:

[Figure: a digraph on the vertices $0, 1, \dots, m$ with an arc labeled $U$ from each vertex to the
next higher vertex and an arc labeled $D$ from each vertex to the next lower vertex.]

Here the alphabet is $\{U, D\}$, with $U$ corresponding to an up step and $D$ corresponding to
a down step. The vertices represent the possible heights at each step of the path; indeed, a
Dyck path with height bounded by $m$ must begin and end at height 0, and its height must
stay between 0 and $m$.

We can also add a letter $F$ for flat steps, and so we can represent Motzkin paths and
Schröder paths using the following, which we call the Motzkin monoid network of order $m$:

[Figure: the same digraph on the vertices $0, 1, \dots, m$, with an additional loop labeled $F$ at
each vertex.]

Using monoid networks, we can model a wide variety of bounded lattice paths with
different types of steps and various restrictions, so we may use the tools that we have for
monoid networks to obtain generating functions for counting lattice paths of bounded height.
Taking the formal power series limit as $m \to \infty$ yields analogous results for lattice paths of
unbounded height.
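
To illustrate the correspondence between bounded Motzkin paths and walks, here is a minimal sketch that builds the arcs of a network on the heights $0, \dots, m$ (with $U$ raising the height, $D$ lowering it, and $F$ fixing it, as described in the text) and counts closed walks at vertex 0 by propagating counts one step at a time. The function names are hypothetical.

```python
def motzkin_network(m):
    """Arcs of the network on heights 0..m, as {(height, letter): target}."""
    arcs = {}
    for h in range(m + 1):
        arcs[(h, "F")] = h
        if h < m:
            arcs[(h, "U")] = h + 1
        if h > 0:
            arcs[(h, "D")] = h - 1
    return arcs

def count_paths(m, n):
    """Number of length-n walks from vertex 0 to itself, i.e. Motzkin paths
    of length n with height bounded by m."""
    arcs = motzkin_network(m)
    counts = [1] + [0] * m  # counts[h] = walks from 0 currently at height h
    for _ in range(n):
        nxt = [0] * (m + 1)
        for (h, _letter), target in arcs.items():
            nxt[target] += counts[h]
        counts = nxt
    return counts[0]

if __name__ == "__main__":
    print([count_paths(1, n) for n in range(8)])
    print([count_paths(3, n) for n in range(8)])
```

When $m$ is at least half the length, the bound is never active and the counts coincide with the ordinary Motzkin numbers, which is the finite shadow of taking $m \to \infty$.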

The idea of representing lattice paths as walks in digraphs and the transfer-matrix method 
are standard techniques in lattice path enumeration; see HU for a recent survey of the 
literature. Such an approach has not yet been combined with the Goulden-Jackson cluster 
method to count lattice paths by occurrences of subwords, which we shall do here. 

However, the original version of the cluster method was applied by Wang ra to count 
Dyck paths by occurrences of various subwords. His approach is fundamentally different in 
that it relies on recursive decompositions of paths and does not use the correspondence to 
walks in digraphs, whereas our method reduces almost all of the computations to matrix 
algebra. Because Wang conducted his investigation on Dyck paths, we shall instead focus 
on Motzkin paths in this paper. 

4.2. A note on continued fractions 

A finite continued fraction is an expression of the form

$$a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{\ddots + \cfrac{b_m}{a_m}}}$$

which we will write as

$$a_0 + \frac{b_1}{a_1 +} \, \frac{b_2}{a_2 +} \cdots \frac{b_m}{a_m}$$

for compactness. We say that a finite continued fraction has depth $m$ if it is written with $m$
fraction bars when completely written out in this notation, so the continued fraction above
has depth $m$. We write an infinite continued fraction

$$a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \ddots}}$$

as

$$a_0 + \frac{b_1}{a_1 +} \, \frac{b_2}{a_2 +} \cdots$$

and say that it has infinite depth.

Continued fractions arise naturally in combinatorics and especially in lattice path
enumeration; e.g., see Flajolet’s landmark paper [5]. Many of our results in this section are
continued fraction formulae.
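
For readers who want to experiment with the notation, the following sketch evaluates a finite continued fraction of depth $m$ exactly, working from the innermost fraction outward. The list convention (`b[k-1]` standing for $b_k$) is an assumption of this example.

```python
from fractions import Fraction

def continued_fraction(a, b):
    """Evaluate a[0] + b[0]/(a[1] + b[1]/(... + b[m-1]/a[m])) exactly.
    Here b[k-1] plays the role of b_k in the text, so len(a) == len(b) + 1
    and the depth is len(b)."""
    value = Fraction(a[-1])
    for a_k, b_k in zip(reversed(a[:-1]), reversed(b)):
        value = a_k + b_k / value
    return value

if __name__ == "__main__":
    # Depth 3 with all a_k = b_k = 1.
    print(continued_fraction([1, 1, 1, 1], [1, 1, 1]))
```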


4.3. Counting Motzkin paths by ascents 


Let $\mathcal{M}_n^m$ be the set of Motzkin paths of length $n$ with height bounded by $m$ and $\mathcal{M}_n$ the set
of all Motzkin paths of length $n$. An ascent of a Motzkin path $\mu$ is a maximal consecutive
sequence of up steps in $\mu$, and let us define $\mathrm{asc}\,\mu$ to be the number of ascents in $\mu$. We also
define

$$F_m^{\mathrm{asc}}(x, t) = \sum_{n=0}^{\infty} \sum_{\mu \in \mathcal{M}_n^m} t^{\mathrm{asc}\,\mu} x^n
\quad \text{and} \quad
F^{\mathrm{asc}}(x, t) = \sum_{n=0}^{\infty} \sum_{\mu \in \mathcal{M}_n} t^{\mathrm{asc}\,\mu} x^n$$

to be bivariate generating functions for Motzkin paths with height bounded by $m$ and regular
Motzkin paths, respectively, weighted by length and number of ascents. Our main result here
is the following:

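Theorem 6. Let $\{P_m^{\mathrm{asc}}(x, t)\}_{m \ge 0}$ be a sequence of polynomials recursively defined by

$$P_m^{\mathrm{asc}}(x, t) = (1 - x - x^2(t - 1)) P_{m-1}^{\mathrm{asc}}(x, t) - (x^2 + x^3(t - 1)) P_{m-2}^{\mathrm{asc}}(x, t)$$

for $m \ge 2$ and $P_0^{\mathrm{asc}}(x, t) = 1$ and $P_1^{\mathrm{asc}}(x, t) = 1 - x$. Then

$$F_m^{\mathrm{asc}}(x, t) = \frac{P_m^{\mathrm{asc}}(x, t)}{P_{m+1}^{\mathrm{asc}}(x, t)}
= \underbrace{\frac{1}{1 - x - x^2(t - 1) -} \, \frac{x^2 + x^3(t - 1)}{1 - x - x^2(t - 1) -} \cdots \frac{x^2 + x^3(t - 1)}{1 - x - x^2(t - 1) -} \, \frac{x^2 + x^3(t - 1)}{1 - x}}_{\text{depth } m + 1}$$

for $m \ge 1$ and

$$\begin{aligned}
F^{\mathrm{asc}}(x, t) &= \frac{1}{1 - x - x^2(t - 1) -} \, \frac{x^2 + x^3(t - 1)}{1 - x - x^2(t - 1) -} \, \frac{x^2 + x^3(t - 1)}{1 - x - x^2(t - 1) -} \cdots \\
&= \frac{1 - x - x^2(t - 1) - \sqrt{1 - 2x - x^2(2t + 1) - 2x^3(t - 1) + x^4(t - 1)^2}}{2(x^2 + x^3(t - 1))}.
\end{aligned}$$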

Proof. We apply the cluster method to the Motzkin monoid network of order $m$ with $B = \{UD, UF\}$, since the number of occurrences of the subwords $UD$ and $UF$ in a Motzkin path is equal to its number of ascents. We weight both $UD$ and $UF$ by $t$. The only clusters formed by $UD$ and $UF$ are themselves, and so we have the $(m+1) \times (m+1)$ cluster matrix

\[
L_G(t) =
\begin{bmatrix}
UDt & UFt & & & \\
 & UDt & UFt & & \\
 & & \ddots & \ddots & \\
 & & & UDt & UFt \\
 & & & & 0
\end{bmatrix}.
\]

Then, by Theorem 4, $F_G(t)$ is the inverse matrix of $A_m - L_G(t-1)$, where $A_m$ is the $(m+1) \times (m+1)$ matrix given by

\[
A_m =
\begin{bmatrix}
1-F & -U & & & \\
-D & 1-F & -U & & \\
 & -D & 1-F & \ddots & \\
 & & \ddots & \ddots & -U \\
 & & & -D & 1-F
\end{bmatrix}.
\]

Thus, $F_m^{\mathrm{asc}}(x,t)$ is the $(1,1)$ entry of $M_m^{-1}$ where $M_m$ is the $(m+1) \times (m+1)$ matrix

\[
M_m =
\begin{bmatrix}
1 - x - x^2(t-1) & -x - x^2(t-1) & & & \\
-x & 1 - x - x^2(t-1) & -x - x^2(t-1) & & \\
 & -x & 1 - x - x^2(t-1) & \ddots & \\
 & & \ddots & \ddots & -x - x^2(t-1) \\
 & & & -x & 1 - x
\end{bmatrix}
\]

obtained by applying the homomorphism sending $U, D, F \mapsto x$ to $A_m - L_G(t-1)$. By Cramer's rule, we can compute this generating function as the quotient of two determinants

\[
F_m^{\mathrm{asc}}(x,t) = \frac{\det M_{m-1}}{\det M_m}.
\]

Using column-addition matrix operations, which preserve the determinant, we can then transform $M_m$ into an upper-triangular matrix with diagonal entries

\[
u_{i,i} =
\begin{cases}
1 - x - x^2(t-1) - \dfrac{x^2 + x^3(t-1)}{u_{i+1,i+1}}, & \text{if } 1 \leq i \leq m \\[2ex]
1 - x, & \text{if } i = m+1.
\end{cases}
\]


From here we deduce the recursive expression

\[
\begin{aligned}
\det M_m &= \prod_{i=1}^{m+1} u_{i,i} \\
&= \left(1 - x - x^2(t-1) - \frac{x^2 + x^3(t-1)}{\det M_{m-1} / \det M_{m-2}}\right) \det M_{m-1} \\
&= (1 - x - x^2(t-1)) \det M_{m-1} - (x^2 + x^3(t-1)) \det M_{m-2}
\end{aligned}
\]

with initial conditions $\det M_{-1} = 1$ and $\det M_0 = 1 - x$. Hence, these determinants are polynomials, and we write $P_m^{\mathrm{asc}} = \det M_{m-1}$. Moreover,

\[
\frac{\det M_m}{\det M_{m-1}} = 1 - x - x^2(t-1) - \underbrace{\cfrac{x^2 + x^3(t-1)}{1 - x - x^2(t-1) - \cfrac{x^2 + x^3(t-1)}{\ddots - \cfrac{x^2 + x^3(t-1)}{1 - x}}}}_{\text{depth } m},
\]

so

\[
F_m^{\mathrm{asc}}(x,t) = \underbrace{\cfrac{1}{1 - x - x^2(t-1) - \cfrac{x^2 + x^3(t-1)}{1 - x - x^2(t-1) - \cfrac{x^2 + x^3(t-1)}{\ddots - \cfrac{x^2 + x^3(t-1)}{1 - x}}}}}_{\text{depth } m+1}.
\tag{6}
\]

We now proceed to Motzkin paths unbounded by height. By taking the limit of (6) as $m \to \infty$, this sequence of formal power series converges to the infinite continued fraction

\[
F^{\mathrm{asc}}(x,t) = \cfrac{1}{1 - x - x^2(t-1) - \cfrac{x^2 + x^3(t-1)}{1 - x - x^2(t-1) - \cfrac{x^2 + x^3(t-1)}{1 - x - x^2(t-1) - \cdots}}}.
\tag{7}
\]

Equation (7) gives the recursive expression

\[
F^{\mathrm{asc}}(x,t) = \frac{1}{1 - x - x^2(t-1) - (x^2 + x^3(t-1))\,F^{\mathrm{asc}}(x,t)}
\]

or

\[
(x^2 + x^3(t-1))\,F^{\mathrm{asc}}(x,t)^2 - (1 - x - x^2(t-1))\,F^{\mathrm{asc}}(x,t) + 1 = 0,
\]

and solving this functional equation gives

\[
F^{\mathrm{asc}}(x,t) = \frac{1 - x - x^2(t-1) \pm \sqrt{1 - 2x - x^2(2t+1) - 2x^3(t-1) + x^4(t-1)^2}}{2(x^2 + x^3(t-1))},
\]

but one can easily check that the subtraction solution is the correct one. □

The first several terms of $F^{\mathrm{asc}}(x,t)$ are in the following table:


  n    $[x^n]\,F^{\mathrm{asc}}(x,t)$
  0    $1$
  1    $1$
  2    $1 + t$
  3    $1 + 3t$
  4    $1 + 7t + t^2$
  5    $1 + 14t + 6t^2$
  6    $1 + 26t + 23t^2 + t^3$
  7    $1 + 46t + 70t^2 + 10t^3$
  8    $1 + 79t + 186t^2 + 56t^3 + t^4$
  9    $1 + 133t + 451t^2 + 235t^3 + 15t^4$
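The entries of this table can be reproduced by direct enumeration. The sketch below is an illustrative check rather than part of the cluster method: it lists all Motzkin paths of a given length as words over $\{U, D, F\}$ and counts ascents as occurrences of $UD$ or $UF$, as in the proof of Theorem 6. The function names are hypothetical.

```python
from collections import Counter

def motzkin_paths(n, height=0):
    """Yield Motzkin paths of length n (strings over U, D, F) that start at `height`."""
    if n == 0:
        if height == 0:
            yield ""
        return
    for letter, dh in (("U", 1), ("D", -1), ("F", 0)):
        h = height + dh
        if h < 0 or h > n - 1:      # below the axis, or too high to return in time
            continue
        for rest in motzkin_paths(n - 1, h):
            yield letter + rest

def ascent_polynomial(n):
    """Coefficient list [c_0, c_1, ...] of [x^n] F^asc: c_k paths of length n with k ascents."""
    counts = Counter(p.count("UD") + p.count("UF") for p in motzkin_paths(n))
    return [counts[k] for k in range(max(counts) + 1)]

for n in range(10):
    print(n, ascent_polynomial(n))
```

Enumeration grows exponentially with $n$, so this approach is only practical for short paths.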


These numbers are in the OEIS [15, A114580]. Notice that the constant terms of these
polynomials are all 1; the only Motzkin paths with no ascents consist of all flat steps, and 
there is exactly one of each length. We also obtain an expression for the linear coefficients, 
which count Motzkin paths with exactly one ascent. 


Corollary 7. Let $\mathrm{Fib}(i)$ denote the $i$th Fibonacci number defined by $\mathrm{Fib}(0) = 0$, $\mathrm{Fib}(1) = 1$, and $\mathrm{Fib}(n) = \mathrm{Fib}(n-1) + \mathrm{Fib}(n-2)$ for $n \geq 2$. Then the number of Motzkin paths of length $n \geq 1$ with exactly one ascent is equal to $\mathrm{Fib}(n+3) - n - 2$.


Proof. Using Maple, one may verify that

\[
\frac{\partial}{\partial t} F^{\mathrm{asc}}(x,t) \Big|_{t=0} = \frac{x^2}{(1 - x - x^2)(1 - x)^2}.
\]

It is known that

\[
\frac{x}{(1 - x - x^2)(1 - x)^2}
\]

is the generating function for the sequence $\mathrm{Fib}(n+4) - n - 3$ (see [15, A001924]). Then,

\[
[x^n t]\, F^{\mathrm{asc}}(x,t)
= [x^n]\, \frac{\partial}{\partial t} F^{\mathrm{asc}}(x,t) \Big|_{t=0}
= [x^{n-1}]\, \frac{x}{(1 - x - x^2)(1 - x)^2}
= \mathrm{Fib}(n+3) - n - 2.
\]

□
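Corollary 7 can also be checked without a computer algebra system. The quadratic functional equation for $F^{\mathrm{asc}}$ from the proof of Theorem 6 can be rewritten as $F = 1 + (x + x^2(t-1))F + (x^2 + x^3(t-1))F^2$, and iterating this map on truncated power series fixes at least one more coefficient in $x$ per step. The sketch below does this and compares the coefficients of $x^n t$ with $\mathrm{Fib}(n+3) - n - 2$; the truncation order and the helper names are assumptions for illustration.

```python
N = 12  # truncation order in x (an arbitrary choice for this sketch)

def mul(p, q):
    """Multiply polynomials {(deg_x, deg_t): coeff}, dropping terms beyond x^N."""
    r = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            if i1 + i2 <= N:
                key = (i1 + i2, j1 + j2)
                r[key] = r.get(key, 0) + c1 * c2
    return {k: c for k, c in r.items() if c}

def add(p, q):
    """Return p + q with zero coefficients removed."""
    r = dict(p)
    for k, c in q.items():
        r[k] = r.get(k, 0) + c
    return {k: c for k, c in r.items() if c}

A = {(1, 0): 1, (2, 0): -1, (2, 1): 1}   # x + x^2 (t - 1)
C = {(2, 0): 1, (3, 0): -1, (3, 1): 1}   # x^2 + x^3 (t - 1)

# Fixed-point iteration F <- 1 + A F + C F^2, correct up to x^N after N + 1 rounds.
F = {(0, 0): 1}
for _ in range(N + 1):
    F = add(add({(0, 0): 1}, mul(A, F)), mul(C, mul(F, F)))

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

one_ascent = [F.get((n, 1), 0) for n in range(1, N + 1)]
formula = [fib(n + 3) - n - 2 for n in range(1, N + 1)]
print(one_ascent)
print(one_ascent == formula)
```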

The leading coefficients of the even-degree polynomials are 1; a Motzkin path of length $2n$ has at most $n$ ascents, and it attains $n$ ascents only when the path is $(UD)^n$. A Motzkin path of length $2n+1$ also has at most $n$ ascents, and we show that the leading coefficients of the odd-degree polynomials are the triangular numbers (see [15, A000217]):

Proposition 8. The number of Motzkin paths of length $2n+1$ with $n$ ascents is $\binom{n+2}{2}$.

Proof. The maximum number of ascents that a Motzkin path of length $2n+1$ can have is $n$. Fix such a path $\mu$, and let $k$ be the number of subwords $UD$ that occur at height 0 in $\mu$.

• If $k = n$, then the remaining step (which must be a flat step) can be in $k+1$ possible positions: at the beginning, at the end, or between two consecutive occurrences of $UD$.

• If $k < n$, then it is easy to see that in order for $\mu$ to have $n$ ascents, the remaining steps must form the subword $UF(UD)^{n-k-1}D$ beginning at height 0. Again, there are $k+1$ possible positions for this subword: at the beginning, at the end, or between two consecutive occurrences of $UD$.

Summing over all $k$, we conclude that the number of Motzkin paths of length $2n+1$ with $n$ ascents is equal to

\[
\sum_{k=0}^{n} (k+1) = \binom{n+2}{2}.
\]

□
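Proposition 8 can be checked for small $n$ with a short dynamic program, which avoids listing paths explicitly. In the sketch below, a state records the current height, whether the previous step was an up step, and the number of ascents so far; a new ascent starts exactly when an up step follows a step that is not an up step. The function name and the state encoding are assumptions for illustration.

```python
from collections import defaultdict
from math import comb

def ascent_counts(n):
    """Return {k: number of Motzkin paths of length n with exactly k ascents}."""
    # state (height, previous step was U, ascents so far) -> number of prefixes
    states = {(0, False, 0): 1}
    for _ in range(n):
        nxt = defaultdict(int)
        for (h, prev_up, k), c in states.items():
            nxt[(h, False, k)] += c                              # flat step
            if h > 0:
                nxt[(h - 1, False, k)] += c                      # down step
            nxt[(h + 1, True, k if prev_up else k + 1)] += c     # up step
        states = nxt
    result = defaultdict(int)
    for (h, _prev_up, k), c in states.items():
        if h == 0:                                               # path must end on the axis
            result[k] += c
    return dict(result)

for n in range(7):
    counts = ascent_counts(2 * n + 1)
    print(n, counts[n], comb(n + 2, 2))
```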

We can also use the generalized cluster method to count Motzkin paths with ascents ending only at specified heights. Let $\mathbb{P}$ be the set of positive integers, $\mathbb{N}$ the set of non-negative integers, $\mathbb{E}$ the set of positive even integers, $\mathbb{O}$ the set of positive odd integers, and $\mathbb{E}_{\geq 0}$ the set of non-negative even integers.

Theorem 9. Let $A \subseteq \mathbb{P}$ and let

\[
F^{\mathrm{asc}}(A; x) = \sum_{n=0}^{\infty} c_n x^n
\]

where $c_n$ is the number of Motzkin paths of length $n$ with every ascent ending at a height in $A$. Then,

\[
F^{\mathrm{asc}}(A; x) = \cfrac{1}{1 - x + C_1 - \cfrac{x^2 - xC_1}{1 - x + C_2 - \cfrac{x^2 - xC_2}{1 - x + C_3 - \cdots}}}
\]

where

\[
C_i =
\begin{cases}
x^2, & \text{if } i \notin A \\
0, & \text{otherwise.}
\end{cases}
\]

Proof. We weight both $UD$ and $UF$ by $t$, but we only wish to consider instances of these subwords occurring at impermissible heights as we will be setting $t = 0$ afterward. The impermissible heights are $i - 1$ where $i \notin A$, so that the corresponding ascents end at height $i$. Thus, following the proof of Theorem 6, we take the cluster matrix $L_G(t)$ but delete all entries in rows $i - 1$ with $i \in A$. We obtain the result by applying the cluster method, using matrix operations to obtain a continued fraction formula, and then taking the limit as $m \to \infty$ (all in the same way as before), and finally by setting $t = 0$. □


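For example, taking $A = \mathbb{E}$ and $A = \mathbb{O}$, we obtain

\[
\begin{aligned}
F^{\mathrm{asc}}(\mathbb{E}; x)
&= \cfrac{1}{1 - x + x^2 - \cfrac{x^2 - x^3}{1 - x - \cfrac{x^2}{1 - x + x^2 - \cfrac{x^2 - x^3}{1 - x - \cfrac{x^2}{1 - x + x^2 - \cdots}}}}} \\
&= \frac{1 - 2x + 2x^2 - \sqrt{1 - 4x + 4x^2 - 4x^4 + 4x^5}}{2(x^2 - x^3 + x^4)} \\
&= 1 + x + x^2 + x^3 + 2x^4 + 5x^5 + 12x^6 + 27x^7 + 60x^8 + 135x^9 + 309x^{10} + \cdots
\end{aligned}
\]

and

\[
\begin{aligned}
F^{\mathrm{asc}}(\mathbb{O}; x)
&= \cfrac{1}{1 - x - \cfrac{x^2}{1 - x + x^2 - \cfrac{x^2 - x^3}{1 - x - \cfrac{x^2}{1 - x + x^2 - \cfrac{x^2 - x^3}{1 - x - \cdots}}}}} \\
&= \frac{1 - 2x + 2x^2 - 2x^3 - \sqrt{1 - 4x + 4x^2 - 4x^4 + 4x^5}}{2(x^2 - 2x^3 + x^4)} \\
&= 1 + x + 2x^2 + 4x^3 + \cdots
\end{aligned}
\]

as the generating functions for Motzkin paths with all ascents ending at even heights and odd heights, respectively.$^2$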
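The continued fraction of Theorem 9 can be evaluated as a truncated power series and compared with a direct count. The sketch below works bottom-up through the first `depth` levels; each level contributes a factor of order $x^2$, so stopping after $N$ levels does not affect coefficients up to $x^N$. The integer-list representation of series, the truncation order, and all function names are assumptions for illustration.

```python
N = 10  # compare coefficients of x^0, ..., x^N (an arbitrary choice for this sketch)

def series(*coeffs):
    """Pad or truncate a coefficient list to length N + 1."""
    return (list(coeffs) + [0] * (N + 1))[: N + 1]

def mul(a, b):
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b[: N + 1 - i]):
            c[i + j] += ai * bj
    return c

def inverse(a):
    """Reciprocal of a power series with constant term 1."""
    b = [1] + [0] * N
    for n in range(1, N + 1):
        b[n] = -sum(a[k] * b[n - k] for k in range(1, n + 1))
    return b

def ascent_cf(in_A, depth=N):
    """Series of the continued fraction in Theorem 9, cut off after `depth` levels."""
    u = series(1)                            # placeholder below the last level kept
    for i in range(depth, 0, -1):
        c = 0 if in_A(i) else 1              # C_i = c * x^2
        diag = series(1, -1, c)              # 1 - x + C_i
        num = series(0, 0, 1, -c)            # x^2 - x C_i
        u = [d - v for d, v in zip(diag, mul(num, inverse(u)))]
    return inverse(u)

def motzkin_paths(n, height=0):
    """Yield Motzkin paths of length n (strings over U, D, F) that start at `height`."""
    if n == 0:
        if height == 0:
            yield ""
        return
    for letter, dh in (("U", 1), ("D", -1), ("F", 0)):
        h = height + dh
        if h < 0 or h > n - 1:
            continue
        for rest in motzkin_paths(n - 1, h):
            yield letter + rest

def count_restricted(n, in_A):
    """Number of Motzkin paths of length n whose ascents all end at heights in A."""
    total = 0
    for path in motzkin_paths(n):
        h, ok = 0, True
        for i, step in enumerate(path):
            if step == "U":
                h += 1
                # an ascent ends here when the next step is not another up step
                if path[i + 1] != "U" and not in_A(h):
                    ok = False
                    break
            elif step == "D":
                h -= 1
        total += ok
    return total

def is_even(i):
    return i % 2 == 0

def is_odd(i):
    return i % 2 == 1

print(ascent_cf(is_even))
print([count_restricted(n, is_even) for n in range(N + 1)])
```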

One can produce a refinement of Theorem 9 that also keeps track of the number of ascents. Rather than deleting rows in the cluster matrix, assign each $UD$ and $UF$ in those rows a weight of $u$. After setting $t = 0$, the remaining indeterminates $x$ and $u$ would keep track of length and number of ascents, respectively.

It is also possible to count paths with restrictions on the heights at which ascents begin, but the analysis is slightly more complicated. Here we would want to set $B = \{DU, FU\}$, which suffices for Motzkin paths that do not begin with an ascent. However, Motzkin paths that begin with an ascent can be counted by considering walks in the monoid network from vertex $0'$ to vertex $0$, and we would multiply the result by $t$ at the end to take into account the first ascent.

2 We note that the coefficients of $F^{\mathrm{asc}}(\mathbb{E}; x)$ match OEIS sequence [15, A190171] up to $x^{10}$ and the coefficients of $F^{\mathrm{asc}}(\mathbb{O}; x)$ match OEIS sequence [15, A110334] up to $x^{12}$, but begin to deviate afterward.


4.4. Counting Motzkin paths by plateaus 

We now count Motzkin paths by occurrences of $UF^kD$, which we call a $k$-plateau.$^3$ For a fixed $k$, let $\operatorname{plt}_k(\mu)$ be the number of $k$-plateaus of a Motzkin path $\mu$, and let

\[
F_m^{\mathrm{plt}_k}(x,t) = \sum_{n=0}^{\infty} \sum_{\mu \in \mathcal{M}_n^m} t^{\operatorname{plt}_k(\mu)} x^n
\quad\text{and}\quad
F^{\mathrm{plt}_k}(x,t) = \sum_{n=0}^{\infty} \sum_{\mu \in \mathcal{M}_n} t^{\operatorname{plt}_k(\mu)} x^n.
\]

Then we have the following formulae:

Theorem 10. Let $\{P_m^{\mathrm{plt}_k}(x,t)\}_{m \geq 0}$ be a sequence of polynomials recursively defined by

\[
P_m^{\mathrm{plt}_k}(x,t) = (1 - x - x^{k+2}(t-1))\,P_{m-1}^{\mathrm{plt}_k}(x,t) - x^2 P_{m-2}^{\mathrm{plt}_k}(x,t)
\]

for $m \geq 2$ and $P_0^{\mathrm{plt}_k}(x,t) = 1$ and $P_1^{\mathrm{plt}_k}(x,t) = 1 - x$. Then

\[
F_m^{\mathrm{plt}_k}(x,t) = \frac{P_m^{\mathrm{plt}_k}(x,t)}{P_{m+1}^{\mathrm{plt}_k}(x,t)}
= \underbrace{\cfrac{1}{1 - x - x^{k+2}(t-1) - \cfrac{x^2}{1 - x - x^{k+2}(t-1) - \cfrac{x^2}{\ddots - \cfrac{x^2}{1 - x}}}}}_{\text{depth } m+1}
\]

for $m \geq 1$ and

\[
F^{\mathrm{plt}_k}(x,t)
= \cfrac{1}{1 - x - x^{k+2}(t-1) - \cfrac{x^2}{1 - x - x^{k+2}(t-1) - \cfrac{x^2}{1 - x - x^{k+2}(t-1) - \cdots}}}
= \frac{1 - x - x^{k+2}(t-1) - \sqrt{(1 - x - x^{k+2}(t-1))^2 - 4x^2}}{2x^2}.
\]

3 These are sometimes also called $k$-humps in the literature.

The two formulae for $F^{\mathrm{plt}_k}(x,t)$ were found earlier by Drake and Gantner [3, Proposition 3.4 and Theorem 4.2] using a different method; here we give a proof using our generalization of the cluster method.

Proof. Set $B = \{UF^kD\}$, and once again consider the Motzkin monoid network of order $m$. The only cluster formed by $UF^kD$ is itself, and so the $(m+1) \times (m+1)$ cluster matrix is

\[
L_G(t) =
\begin{bmatrix}
UF^kDt & & & & \\
 & UF^kDt & & & \\
 & & \ddots & & \\
 & & & UF^kDt & \\
 & & & & 0
\end{bmatrix}.
\]

By Theorem 4 we have $F_G(t) = (A_m - L_G(t-1))^{-1}$ (where $A_m$ is defined in the proof of Theorem 6), and so $F_m^{\mathrm{plt}_k}(x,t)$ is the $(1,1)$ entry of $M_m^{-1}$ where $M_m$ is the matrix

\[
M_m =
\begin{bmatrix}
1 - x - x^{k+2}(t-1) & -x & & & \\
-x & 1 - x - x^{k+2}(t-1) & -x & & \\
 & -x & 1 - x - x^{k+2}(t-1) & \ddots & \\
 & & \ddots & \ddots & -x \\
 & & & -x & 1 - x
\end{bmatrix}
\]

obtained by applying to $A_m - L_G(t-1)$ the homomorphism sending each of $U$, $F$, and $D$ to $x$. It follows that

\[
F_m^{\mathrm{plt}_k}(x,t) = \frac{\det M_{m-1}}{\det M_m},
\]

and the determinant of $M_m$ is equal to that of an upper-triangular matrix with diagonal entries

\[
u_{i,i} =
\begin{cases}
1 - x - x^{k+2}(t-1) - \dfrac{x^2}{u_{i+1,i+1}}, & \text{if } 1 \leq i \leq m \\[2ex]
1 - x, & \text{if } i = m+1.
\end{cases}
\]


Thus we have the recursion

\[
\begin{aligned}
\det M_m &= \prod_{i=1}^{m+1} u_{i,i} \\
&= \left(1 - x - x^{k+2}(t-1) - \frac{x^2}{\det M_{m-1} / \det M_{m-2}}\right) \det M_{m-1} \\
&= (1 - x - x^{k+2}(t-1)) \det M_{m-1} - x^2 \det M_{m-2}
\end{aligned}
\]

with initial conditions $\det M_{-1} = 1$ and $\det M_0 = 1 - x$. These are polynomials, and we write $P_m^{\mathrm{plt}_k} = \det M_{m-1}$. Moreover,

\[
\frac{\det M_m}{\det M_{m-1}} = 1 - x - x^{k+2}(t-1) - \underbrace{\cfrac{x^2}{1 - x - x^{k+2}(t-1) - \cfrac{x^2}{\ddots - \cfrac{x^2}{1 - x}}}}_{\text{depth } m},
\]

so

\[
F_m^{\mathrm{plt}_k}(x,t) = \underbrace{\cfrac{1}{1 - x - x^{k+2}(t-1) - \cfrac{x^2}{1 - x - x^{k+2}(t-1) - \cfrac{x^2}{\ddots - \cfrac{x^2}{1 - x}}}}}_{\text{depth } m+1}.
\]

Taking the limit as $m \to \infty$, we obtain

\[
F^{\mathrm{plt}_k}(x,t)
= \cfrac{1}{1 - x - x^{k+2}(t-1) - \cfrac{x^2}{1 - x - x^{k+2}(t-1) - \cfrac{x^2}{1 - x - x^{k+2}(t-1) - \cdots}}}
= \frac{1}{1 - x - x^{k+2}(t-1) - x^2 F^{\mathrm{plt}_k}(x,t)},
\]

which can be rewritten as

\[
x^2 F^{\mathrm{plt}_k}(x,t)^2 - (1 - x - x^{k+2}(t-1))\,F^{\mathrm{plt}_k}(x,t) + 1 = 0.
\tag{8}
\]

Solving (8) gives

\[
F^{\mathrm{plt}_k}(x,t) = \frac{1 - x - x^{k+2}(t-1) \pm \sqrt{(1 - x - x^{k+2}(t-1))^2 - 4x^2}}{2x^2},
\]

but one can check that the subtraction solution is the correct one. □


By specializing to $k = 0$ and defining $\operatorname{peak} = \operatorname{plt}_0$, we obtain the bivariate generating function

\[
F^{\mathrm{peak}}(x,t) = \frac{1 - x - x^2(t-1) - \sqrt{(1 - x - x^2(t-1))^2 - 4x^2}}{2x^2}
\]

for Motzkin paths by peaks, which are occurrences of $UD$. The first several terms of $F^{\mathrm{peak}}(x,t)$ are in the following table:


  n    $[x^n]\,F^{\mathrm{peak}}(x,t)$
  0    $1$
  1    $1$
  2    $1 + t$
  3    $2 + 2t$
  4    $4 + 4t + t^2$
  5    $8 + 10t + 3t^2$
  6    $17 + 24t + 9t^2 + t^3$
  7    $37 + 58t + 28t^2 + 4t^3$
  8    $82 + 143t + 81t^2 + 16t^3 + t^4$
  9    $185 + 354t + 231t^2 + 60t^3 + 5t^4$


See [15, A097860] for its OEIS entry. Also see [15, A004148] for the constant coefficients of these polynomials, which count Motzkin paths with no peaks. The generating function for the linear coefficients of these polynomials can be verified to be

\[
\frac{\partial}{\partial t} F^{\mathrm{peak}}(x,t) \Big|_{t=0} = \frac{1 - x + x^2 - \sqrt{1 - 2x - x^2 - 2x^3 + x^4}}{2\sqrt{1 - 2x - x^2 - 2x^3 + x^4}},
\]

and interestingly enough, dividing this generating function by $x$ (i.e., shifting the indices of the underlying sequence) yields the generating function for the number of flat steps in all peakless Motzkin paths of length $n$ (see [15, A110236]). These numbers are given by a binomial coefficient sum, which in turn gives us the following:


Corollary 11. The number of Motzkin paths of length n > 2 with exactly one peak is equal 

<»EEoCth)(A)- 

Now let us consider 1-plateaus, or occurrences of $UFD$. The bivariate generating function

\[
F^{\mathrm{plt}_1}(x,t) = \frac{1 - x - x^3(t-1) - \sqrt{(1 - x - x^3(t-1))^2 - 4x^2}}{2x^2}
\]

counts Motzkin paths by 1-plateaus, and its first several terms are:


  n    $[x^n]\,F^{\mathrm{plt}_1}(x,t)$
  0    $1$
  1    $1$
  2    $2$
  3    $3 + t$
  4    $7 + 2t$
  5    $15 + 6t$
  6    $36 + 14t + t^2$
  7    $85 + 39t + 3t^2$
  8    $209 + 102t + 12t^2$
  9    $517 + 280t + 37t^2 + t^3$
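Both plateau tables can be cross-checked against the functional equation (8). Rewriting (8) as $F = 1 + (x + x^{k+2}(t-1))F + x^2F^2$ gives a fixed-point iteration on truncated power series, and the result can be compared with a brute-force count of occurrences of $UF^kD$. The sketch below does this for $k = 0, 1, 2$; the truncation order and helper names are assumptions for illustration.

```python
from collections import Counter

N = 9  # truncation order in x (an arbitrary choice for this sketch)

def mul(p, q):
    """Multiply polynomials {(deg_x, deg_t): coeff}, dropping terms beyond x^N."""
    r = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            if i1 + i2 <= N:
                key = (i1 + i2, j1 + j2)
                r[key] = r.get(key, 0) + c1 * c2
    return {k: c for k, c in r.items() if c}

def add(p, q):
    r = dict(p)
    for k, c in q.items():
        r[k] = r.get(k, 0) + c
    return {k: c for k, c in r.items() if c}

def plateau_series(k):
    """Solve equation (8) up to x^N by iterating F <- 1 + A F + x^2 F^2."""
    A = {(1, 0): 1, (k + 2, 0): -1, (k + 2, 1): 1}   # x + x^(k+2) (t - 1)
    X2 = {(2, 0): 1}
    F = {(0, 0): 1}
    for _ in range(N + 1):
        F = add(add({(0, 0): 1}, mul(A, F)), mul(X2, mul(F, F)))
    return F

def motzkin_paths(n, height=0):
    """Yield Motzkin paths of length n (strings over U, D, F) that start at `height`."""
    if n == 0:
        if height == 0:
            yield ""
        return
    for letter, dh in (("U", 1), ("D", -1), ("F", 0)):
        h = height + dh
        if h < 0 or h > n - 1:
            continue
        for rest in motzkin_paths(n - 1, h):
            yield letter + rest

def plateau_counts(k):
    """{(length, number of k-plateaus): count} by direct enumeration."""
    word = "U" + "F" * k + "D"
    counts = Counter()
    for n in range(N + 1):
        for path in motzkin_paths(n):
            counts[(n, path.count(word))] += 1
    return dict(counts)

for k in (0, 1, 2):
    print(k, plateau_series(k) == plateau_counts(k))
```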


These are also in the OEIS [15, A114583], along with the constant coefficients [15, A114584], which count Motzkin paths with no occurrences of $UFD$.

We can also count Motzkin paths by all plateaus, without a fixed $k$. Let $\operatorname{plt}(\mu)$ be the number of plateaus in a Motzkin path $\mu$, that is, the number of occurrences of subwords in $B = \{UD, UFD, UFFD, \dots\}$. We define the bivariate generating functions $F_m^{\mathrm{plt}}(x,t)$ and $F^{\mathrm{plt}}(x,t)$ in the same way as before, and to determine these generating functions, we would change each nonzero entry in the cluster matrix from $UF^kDt$ (for a fixed $k$) to

\[
\sum_{k=0}^{\infty} UF^kDt = U(1 - F)^{-1}Dt.
\]

Then the computation would follow in the same way, yielding the following result:

Theorem 12. Let $\{R_m^{\mathrm{plt}}(x,t)\}_{m \geq 0}$ be a sequence of rational functions recursively defined by

\[
R_m^{\mathrm{plt}}(x,t) = \left(1 - x - \frac{x^2(t-1)}{1-x}\right) R_{m-1}^{\mathrm{plt}}(x,t) - x^2 R_{m-2}^{\mathrm{plt}}(x,t)
\]

for $m \geq 2$ and $R_0^{\mathrm{plt}}(x,t) = 1$ and $R_1^{\mathrm{plt}}(x,t) = 1 - x$. Then

\[
F_m^{\mathrm{plt}}(x,t) = \frac{R_m^{\mathrm{plt}}(x,t)}{R_{m+1}^{\mathrm{plt}}(x,t)}
= \underbrace{\cfrac{1}{1 - x - \frac{x^2(t-1)}{1-x} - \cfrac{x^2}{1 - x - \frac{x^2(t-1)}{1-x} - \cfrac{x^2}{\ddots - \cfrac{x^2}{1 - x}}}}}_{\text{depth } m+1}
\]

for $m \geq 1$ and

\[
F^{\mathrm{plt}}(x,t)
= \cfrac{1}{1 - x - \frac{x^2(t-1)}{1-x} - \cfrac{x^2}{1 - x - \frac{x^2(t-1)}{1-x} - \cdots}}
= \frac{1 - 2x - x^2(t-2) - \sqrt{1 - 4x - 2x^2(t-2) + 4x^3t + x^4t(t-4)}}{2(x^2 - x^3)}.
\]

The first several terms of $F^{\mathrm{plt}}(x,t)$ are below, which can also be found on the OEIS [15, A097229]:

  n    $[x^n]\,F^{\mathrm{plt}}(x,t)$
  0    $1$
  1    $1$
  2    $1 + t$
  3    $1 + 3t$
  4    $1 + 7t + t^2$
  5    $1 + 15t + 5t^2$
  6    $1 + 31t + 18t^2 + t^3$
  7    $1 + 63t + 56t^2 + 7t^3$
  8    $1 + 127t + 160t^2 + 34t^3 + t^4$
  9    $1 + 255t + 432t^2 + 138t^3 + 9t^4$


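We now give expressions for the linear and quadratic coefficients of these polynomials.

Corollary 13. The number of Motzkin paths of length $n \geq 1$ with exactly one plateau is equal to $2^{n-1} - 1$.

Proof. Using Maple, one may verify that

\[
\frac{\partial}{\partial t} F^{\mathrm{plt}}(x,t) \Big|_{t=0} = \frac{x^2}{(1 - 2x)(1 - x)}.
\]

Then,

\[
\begin{aligned}
[x^n t]\, F^{\mathrm{plt}}(x,t)
&= [x^n]\, \frac{\partial}{\partial t} F^{\mathrm{plt}}(x,t) \Big|_{t=0} \\
&= [x^{n-2}]\, \frac{1}{(1 - 2x)(1 - x)} \\
&= [x^{n-2}] \left( \frac{2}{1 - 2x} - \frac{1}{1 - x} \right) \\
&= [x^{n-2}] \left( \sum_{j=0}^{\infty} (2^{j+1} - 1) x^j \right) \\
&= 2^{n-1} - 1.
\end{aligned}
\]

□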
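The count in Corollary 13, and the two-plateau count $(n-3)n2^{n-6}$ of Corollary 14 stated next, can be checked by enumeration for small $n$. In the sketch below a plateau is matched by the regular expression `UF*D`, which corresponds to the set $B = \{UD, UFD, UFFD, \dots\}$; matches cannot overlap because each one starts with $U$ and ends with $D$. The function names are assumptions for illustration.

```python
import re

def motzkin_paths(n, height=0):
    """Yield Motzkin paths of length n (strings over U, D, F) that start at `height`."""
    if n == 0:
        if height == 0:
            yield ""
        return
    for letter, dh in (("U", 1), ("D", -1), ("F", 0)):
        h = height + dh
        if h < 0 or h > n - 1:
            continue
        for rest in motzkin_paths(n - 1, h):
            yield letter + rest

def plateau_counts(n):
    """Return {k: number of Motzkin paths of length n with exactly k plateaus}."""
    counts = {}
    for path in motzkin_paths(n):
        k = len(re.findall(r"UF*D", path))
        counts[k] = counts.get(k, 0) + 1
    return counts

for n in range(1, 11):
    counts = plateau_counts(n)
    checks = [counts.get(1, 0) == 2 ** (n - 1) - 1]
    if n >= 3:
        # 64 (n - 3) n 2^(n - 6) = (n - 3) n 2^n keeps the comparison in integers
        checks.append(64 * counts.get(2, 0) == (n - 3) * n * 2 ** n)
    print(n, *checks)
```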

Corollary 14. The number of Motzkin paths of length $n \geq 3$ with exactly two plateaus is equal to $(n-3)n2^{n-6}$.

Proof. Using Maple, one may verify that

\[
\frac{\partial^2}{\partial t^2} F^{\mathrm{plt}}(x,t) \Big|_{t=0} = \frac{2(1-x)x^4}{(1-2x)^3}
\]

and

\[
\frac{(1-x)x}{(1-2x)^3}
\]

is known to be the generating function for the sequence $(n(n+3)2^{n-3})_{n \geq 1}$ (see [15, A001793]). Then,

\[
[x^n t^2]\, F^{\mathrm{plt}}(x,t)
= [x^n]\, \frac{1}{2} \frac{\partial^2}{\partial t^2} F^{\mathrm{plt}}(x,t) \Big|_{t=0}
= [x^{n-3}]\, \frac{(1-x)x}{(1-2x)^3}
= (n-3)n2^{n-6}.
\]

□

Hence, Motzkin paths with exactly 1 plateau and those with exactly 2 plateaus are equinumerous with many other combinatorial objects (see [15, A000225 and A001793]).

Drake and Gantner [3, Section 4] showed how one can find continued fraction formulae for variations of these results, including bivariate generating functions for counting Motzkin paths by plateaus occurring only at certain heights, and with restrictions on the lengths of plateaus. Their approach involved inserting appropriate "correction terms" at each level of the continued fraction formulae that encode the types of plateaus that they wish to count.

All of these variations can also be computed using our method. To disregard plateaus occurring at certain heights, we would delete the corresponding rows from the cluster matrix, which is completely analogous to Theorem 9 for ascents. To place restrictions on the lengths of plateaus, we would alter the "forbidden set" $B$ appropriately and set the appropriate indeterminates to 0. We leave the details to the reader.

Our method also allows for an interpretation of Drake and Gantner's correction terms in terms of clusters. Their correction terms are of the form $x^k(t-1)$ for various $k$ and are then multiplied by $x^2$, and these precisely correspond to the terms contributed by the cluster matrix in our computations. This is a relatively simple case because the only clusters formed by the words in $B = \{UD, UFD, UFFD, \dots\}$ are the words in $B$ themselves. Counting paths by subwords having additional clusters would require more complicated correction terms when working through the lens of Drake and Gantner.

4.5. Counting Motzkin paths by peaks and valleys

Peaks, or occurrences of $UD$, were introduced in the previous subsection. Similarly, we define a valley to be an occurrence of $DU$, and $\operatorname{val}\mu$ the number of valleys of a Motzkin path $\mu$. Here we find the joint distribution of peaks and valleys in Motzkin paths. Let

\[
F_m^{\mathrm{p,v}}(x,t_1,t_2) = \sum_{n=0}^{\infty} \sum_{\mu \in \mathcal{M}_n^m} t_1^{\operatorname{peak}\mu} t_2^{\operatorname{val}\mu} x^n
\quad\text{and}\quad
F^{\mathrm{p,v}}(x,t_1,t_2) = \sum_{n=0}^{\infty} \sum_{\mu \in \mathcal{M}_n} t_1^{\operatorname{peak}\mu} t_2^{\operatorname{val}\mu} x^n.
\]

Then we have the following:


Theorem 15. Let $\{R_m^{\mathrm{p,v}}(x,t_1,t_2)\}_{m \geq 0}$ be a sequence of rational functions recursively defined by

\[
R_m^{\mathrm{p,v}}(x,t_1,t_2) = (1 - x - C_1 - C_2)\,R_{m-1}^{\mathrm{p,v}}(x,t_1,t_2) - (x + C_3)^2 R_{m-2}^{\mathrm{p,v}}(x,t_1,t_2)
\]

for $m \geq 2$ and $R_0^{\mathrm{p,v}}(x,t_1,t_2) = 1$ and $R_1^{\mathrm{p,v}}(x,t_1,t_2) = 1 - x - C_2$, where

\[
C_1 = \frac{x^2(t_1-1)}{1 - x^2(t_1-1)(t_2-1)}, \qquad
C_2 = \frac{x^2(t_2-1)}{1 - x^2(t_1-1)(t_2-1)}, \qquad
C_3 = \frac{x^3(t_1-1)(t_2-1)}{1 - x^2(t_1-1)(t_2-1)}.
\]

Then

\[
F_m^{\mathrm{p,v}}(x,t_1,t_2)
= \frac{R_m^{\mathrm{p,v}}(x,t_1,t_2)}{(1 - x - C_1)\,R_m^{\mathrm{p,v}}(x,t_1,t_2) - (x + C_3)^2 R_{m-1}^{\mathrm{p,v}}(x,t_1,t_2)}
= \underbrace{\cfrac{1}{1 - x - C_1 - \cfrac{(x + C_3)^2}{1 - x - C_1 - C_2 - \cfrac{(x + C_3)^2}{\ddots - \cfrac{(x + C_3)^2}{1 - x - C_2}}}}}_{\text{depth } m+1}
\]

for $m \geq 1$ and

\[
F^{\mathrm{p,v}}(x,t_1,t_2)
= \cfrac{1}{1 - x - C_1 - \cfrac{(x + C_3)^2}{1 - x - C_1 - C_2 - \cfrac{(x + C_3)^2}{1 - x - C_1 - C_2 - \cdots}}}
= \frac{2}{1 - x - C_1 + C_2 + \sqrt{(1 - x - C_1 - C_2)^2 - 4(x + C_3)^2}}.
\]


Proof. Set $B = \{UD, DU\}$. This time, we weight occurrences of $UD$ by $t_1$ and occurrences of $DU$ by $t_2$. However, finding the cluster matrix is no longer a trivial task. We make the following observations:

• Clusters starting and ending at height 0 are of the form $UDUD \cdots UD$, since a path cannot go down from height 0. We can decompose these words into a sequence of $UD$s, where the first $UD$ contributes a $t_1$ and each subsequent $UD$ contributes a $t_1$ and a $t_2$.

• Clusters starting and ending at height $m$ are of the form $DUDU \cdots DU$, since a path cannot go up from height $m$. We can decompose these words into a sequence of $DU$s, where the first $DU$ contributes a $t_2$ and each subsequent $DU$ contributes a $t_1$ and a $t_2$.

• Clusters starting and ending at height $k$ with $0 < k < m$ are of the above two forms, since a path can go either up or down from height $k$.

• Clusters starting at height $k$ and ending at height $k+1$ are of the form $UDUDU \cdots DU$, which can be decomposed into an initial subword $UDU$ (contributing a $t_1$ and a $t_2$) and a sequence of $DU$s, each contributing a $t_1$ and a $t_2$.

• Clusters starting at height $k$ and ending at height $k-1$ are of the form $DUDUD \cdots UD$, which can be decomposed into an initial subword $DUD$ (contributing a $t_1$ and a $t_2$) and a sequence of $UD$s, each contributing a $t_1$ and a $t_2$.

Thus, the $(m+1) \times (m+1)$ cluster matrix is

\[
L_G(t_1,t_2) =
\begin{bmatrix}
c_1 & c_3 & & & \\
c_4 & c_1 + c_2 & c_3 & & \\
 & c_4 & c_1 + c_2 & \ddots & \\
 & & \ddots & c_1 + c_2 & c_3 \\
 & & & c_4 & c_2
\end{bmatrix}
\]

where

\[
c_1 = \frac{UDt_1}{1 - UDt_1t_2}, \qquad
c_2 = \frac{DUt_2}{1 - DUt_1t_2}, \qquad
c_3 = \frac{UDUt_1t_2}{1 - DUt_1t_2}, \qquad
c_4 = \frac{DUDt_1t_2}{1 - UDt_1t_2}.
\]

By applying Theorem 4, we see that $F_m^{\mathrm{p,v}}(x,t_1,t_2)$ is the $(1,1)$ entry of $M_m^{-1}$ where $M_m$ is the $(m+1) \times (m+1)$ matrix

\[
M_m =
\begin{bmatrix}
1 - x - C_1 & -x - C_3 & & & \\
-x - C_3 & 1 - x - C_1 - C_2 & -x - C_3 & & \\
 & -x - C_3 & 1 - x - C_1 - C_2 & \ddots & \\
 & & \ddots & 1 - x - C_1 - C_2 & -x - C_3 \\
 & & & -x - C_3 & 1 - x - C_2
\end{bmatrix}
\]

and $C_1$, $C_2$, $C_3$ are defined in the statement of this theorem. Then,

\[
F_m^{\mathrm{p,v}}(x,t_1,t_2)
= \frac{\det M'_m}{\det M_m}
= \frac{\det M'_m}{\left(1 - x - C_1 - \dfrac{(x + C_3)^2}{\det M'_m / \det M'_{m-1}}\right) \det M'_m}
= \frac{\det M'_m}{(1 - x - C_1) \det M'_m - (x + C_3)^2 \det M'_{m-1}}
\]

where $M'_m$ is the $m \times m$ matrix obtained from $M_m$ by deleting the first row and the first column. The determinant of $M'_m$ is equal to that of an upper-triangular matrix with diagonal entries

\[
u_{i,i} =
\begin{cases}
1 - x - C_1 - C_2 - \dfrac{(x + C_3)^2}{u_{i+1,i+1}}, & \text{if } 1 \leq i \leq m-1 \\[2ex]
1 - x - C_2, & \text{if } i = m,
\end{cases}
\]

so

\[
\begin{aligned}
\det M'_m &= \prod_{i=1}^{m} u_{i,i} \\
&= \left(1 - x - C_1 - C_2 - \frac{(x + C_3)^2}{\det M'_{m-1} / \det M'_{m-2}}\right) \det M'_{m-1} \\
&= (1 - x - C_1 - C_2) \det M'_{m-1} - (x + C_3)^2 \det M'_{m-2}
\end{aligned}
\]

with initial conditions $\det M'_0 = 1$ and $\det M'_1 = 1 - x - C_2$. These are rational functions, and we write $R_m^{\mathrm{p,v}} = \det M'_m$. Furthermore,


\[
\frac{\det M_m}{\det M'_m} = 1 - x - C_1 - \underbrace{\cfrac{(x + C_3)^2}{1 - x - C_1 - C_2 - \cfrac{(x + C_3)^2}{\ddots - \cfrac{(x + C_3)^2}{1 - x - C_2}}}}_{\text{depth } m},
\]

so

\[
F_m^{\mathrm{p,v}}(x,t_1,t_2) = \underbrace{\cfrac{1}{1 - x - C_1 - \cfrac{(x + C_3)^2}{1 - x - C_1 - C_2 - \cfrac{(x + C_3)^2}{\ddots - \cfrac{(x + C_3)^2}{1 - x - C_2}}}}}_{\text{depth } m+1}.
\]

By taking the limit as $m \to \infty$, we have that

\[
F^{\mathrm{p,v}}(x,t_1,t_2)
= \cfrac{1}{1 - x - C_1 - \cfrac{(x + C_3)^2}{1 - x - C_1 - C_2 - \cfrac{(x + C_3)^2}{1 - x - C_1 - C_2 - \cdots}}}
= \frac{1}{1 - x - C_1 - (x + C_3)^2\, G(x,t_1,t_2)}
\]

where

\[
G(x,t_1,t_2)
= \cfrac{1}{1 - x - C_1 - C_2 - \cfrac{(x + C_3)^2}{1 - x - C_1 - C_2 - \cfrac{(x + C_3)^2}{1 - x - C_1 - C_2 - \cdots}}}
= \frac{1}{1 - x - C_1 - C_2 - (x + C_3)^2\, G(x,t_1,t_2)}.
\]

Thus we have the functional equation

\[
(x + C_3)^2\, G(x,t_1,t_2)^2 - (1 - x - C_1 - C_2)\, G(x,t_1,t_2) + 1 = 0,
\]

and solving it gives

\[
G(x,t_1,t_2) = \frac{1 - x - C_1 - C_2 \pm \sqrt{(1 - x - C_1 - C_2)^2 - 4(x + C_3)^2}}{2(x + C_3)^2}.
\]

As before, one can verify that the subtraction solution is the correct one, and we conclude that

\[
F^{\mathrm{p,v}}(x,t_1,t_2)
= \frac{1}{1 - x - C_1 - \frac{1}{2}\left(1 - x - C_1 - C_2 - \sqrt{(1 - x - C_1 - C_2)^2 - 4(x + C_3)^2}\right)}
= \frac{2}{1 - x - C_1 + C_2 + \sqrt{(1 - x - C_1 - C_2)^2 - 4(x + C_3)^2}}.
\]

□


The first several terms of $F^{\mathrm{p,v}}(x,t_1,t_2)$ are the following:

  n    $[x^n]\,F^{\mathrm{p,v}}(x,t_1,t_2)$
  0    $1$
  1    $1$
  2    $1 + t_1$
  3    $2 + 2t_1$
  4    $4 + 4t_1 + t_1^2t_2$
  5    $8 + 8t_1 + 2t_1t_2 + t_1^2 + 2t_1^2t_2$
  6    $16 + t_2 + 18t_1 + 6t_1t_2 + 3t_1^2 + 6t_1^2t_2 + t_1^3t_2^2$
  7    $33 + 4t_2 + 40t_1 + 18t_1t_2 + 9t_1^2 + 16t_1^2t_2 + 3t_1^2t_2^2 + 2t_1^3t_2 + 2t_1^3t_2^2$
  8    $69 + 13t_2 + 90t_1 + 50t_1t_2 + 25t_1^2 + 3t_1t_2^2 + 47t_1^2t_2 + t_1^3 + 9t_1^2t_2^2 + 6t_1^3t_2 + 9t_1^3t_2^2 + t_1^4t_2^3$

The constant coefficients, which count Motzkin paths with no peaks and valleys, are in the OEIS [15, A004149].
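The joint distribution of peaks and valleys can be tabulated directly for short paths, which gives an independent check on individual coefficients. The sketch below counts occurrences of $UD$ and $DU$ in each Motzkin path and, as a consistency check, sums out the valleys to recover the peak distribution of Section 4.4. The function names are assumptions for illustration.

```python
from collections import Counter

def motzkin_paths(n, height=0):
    """Yield Motzkin paths of length n (strings over U, D, F) that start at `height`."""
    if n == 0:
        if height == 0:
            yield ""
        return
    for letter, dh in (("U", 1), ("D", -1), ("F", 0)):
        h = height + dh
        if h < 0 or h > n - 1:
            continue
        for rest in motzkin_paths(n - 1, h):
            yield letter + rest

def joint_distribution(n):
    """Return {(peaks, valleys): count} over Motzkin paths of length n."""
    return dict(Counter((p.count("UD"), p.count("DU")) for p in motzkin_paths(n)))

def peak_marginal(n):
    """Sum over valleys: the list of coefficients of [x^n] F^peak(x, t)."""
    totals = Counter()
    for (peaks, _valleys), c in joint_distribution(n).items():
        totals[peaks] += c
    return [totals[k] for k in range(max(totals) + 1)]

for n in range(7):
    print(n, sorted(joint_distribution(n).items()))
```

For instance, the only path of length 4 with a valley is $UDUD$, which has two peaks, so the corresponding term is $t_1^2t_2$.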

Liu et al. [13] gave recursive and continued fraction formulae for counting Dyck paths 
with peaks avoiding a specified set of heights and valleys avoiding another specified set of 
heights. We can do the same thing by applying our cluster method to the monoid network 
for Dyck paths, but here we give the analogous results for Motzkin paths. Note that Liu et 
al. defined the height of a peak (respectively, valley) to be the height at which its down step 
(respectively, up step) occurs, but we use the convention that the height of a peak or valley 
is the height at which the corresponding subword (UD or DU) begins. 


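Theorem 16. Let

\[
F^{\mathrm{p,v}}(P, V; x) = \sum_{n=0}^{\infty} c_n x^n
\]

where $c_n$ is the number of Motzkin paths of length $n$ with every peak occurring at a height in $P \subseteq \mathbb{N}$ and every valley occurring at a height in $V \subseteq \mathbb{P}$. Then,

\[
F^{\mathrm{p,v}}(P, V; x) = \cfrac{1}{1 - x + C_{1,0} - \cfrac{(x + C_{3,0})^2}{1 - x + C_{1,1} + C_{2,1} - \cfrac{(x + C_{3,1})^2}{1 - x + C_{1,2} + C_{2,2} - \cdots}}}
\]

where

\[
C_{1,i} =
\begin{cases}
\dfrac{x^2}{1 - x^2}, & \text{if } i \notin P \text{ and } i+1 \notin V \\[1.5ex]
x^2, & \text{if } i \notin P \text{ and } i+1 \in V \\
0, & \text{otherwise,}
\end{cases}
\qquad
C_{2,i} =
\begin{cases}
\dfrac{x^2}{1 - x^2}, & \text{if } i \notin V \text{ and } i-1 \notin P \\[1.5ex]
x^2, & \text{if } i \notin V \text{ and } i-1 \in P \\
0, & \text{otherwise,}
\end{cases}
\]

and

\[
C_{3,i} =
\begin{cases}
\dfrac{x^3}{1 - x^2}, & \text{if } i \notin P \text{ and } i+1 \notin V \\[1.5ex]
0, & \text{otherwise.}
\end{cases}
\]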

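Proof. We weight both $UD$ and $DU$ by $t$, but we only wish to consider instances of $UD$ at heights $i \notin P$ and instances of $DU$ at heights $i \notin V$. We claim that the cluster matrix, after replacing $t$ by $t - 1$, is

\[
L_G(t-1) =
\begin{bmatrix}
c_{1,0} & c_{3,0} & & & \\
c_{4,1} & c_{1,1} + c_{2,1} & c_{3,1} & & \\
 & c_{4,2} & c_{1,2} + c_{2,2} & \ddots & \\
 & & \ddots & c_{1,m-1} + c_{2,m-1} & c_{3,m-1} \\
 & & & c_{4,m} & c_{2,m}
\end{bmatrix}
\]

where

\[
c_{1,i} =
\begin{cases}
\dfrac{UD(t-1)}{1 - UD(t-1)^2}, & \text{if } i \notin P,\ i+1 \notin V \\[1.5ex]
UD(t-1), & \text{if } i \notin P,\ i+1 \in V \\
0, & \text{otherwise,}
\end{cases}
\qquad
c_{2,i} =
\begin{cases}
\dfrac{DU(t-1)}{1 - DU(t-1)^2}, & \text{if } i \notin V,\ i-1 \notin P \\[1.5ex]
DU(t-1), & \text{if } i \notin V,\ i-1 \in P \\
0, & \text{otherwise,}
\end{cases}
\]

\[
c_{3,i} =
\begin{cases}
\dfrac{UDU(t-1)^2}{1 - DU(t-1)^2}, & \text{if } i \notin P,\ i+1 \notin V \\[1.5ex]
0, & \text{otherwise,}
\end{cases}
\qquad
c_{4,i} =
\begin{cases}
\dfrac{DUD(t-1)^2}{1 - UD(t-1)^2}, & \text{if } i \notin V,\ i-1 \notin P \\[1.5ex]
0, & \text{otherwise.}
\end{cases}
\]

For example, $c_{1,i}$ gives clusters starting and ending at height $i$ and beginning with an up step. Every such cluster begins with a peak, so if $i \in P$, then $c_{1,i} = 0$. Otherwise, $i \notin P$, and if $i+1 \in V$, then the only possible such cluster is $UD$ because all other possible clusters begin with $UD$ and are followed by a valley at height $i+1$. However, if $i \notin P$ and $i+1 \notin V$, then every subword of the form $UDUD \cdots$ is a valid cluster. One can verify the formulae for $c_{2,i}$, $c_{3,i}$, $c_{4,i}$ using similar reasoning, and the result follows from the same process as before. □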


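Below are the generating functions for Motzkin paths with parity restrictions on the heights of peaks and valleys:

\[
\begin{aligned}
F^{\mathrm{p,v}}(\mathbb{O}, \mathbb{E}_{\geq 0}; x)
&= \cfrac{1}{1 - x + \frac{x^2}{1-x^2} - \cfrac{\left(x + \frac{x^3}{1-x^2}\right)^2}{1 - x + \frac{x^2}{1-x^2} - \cfrac{x^2}{1 - x + \frac{x^2}{1-x^2} - \cfrac{\left(x + \frac{x^3}{1-x^2}\right)^2}{1 - x + \frac{x^2}{1-x^2} - \cdots}}}} \\
&= \frac{1 - 2x + 2x^2 - 2x^4 - \sqrt{1 - 4x + 4x^2 - 4x^4}}{2x^2(1 - x + x^3)} \\
&= 1 + x + x^2 + 2x^3 + 5x^4 + 12x^5 + 27x^6 + 60x^7 + 136x^8 + \cdots
\end{aligned}
\]

\[
\begin{aligned}
F^{\mathrm{p,v}}(\mathbb{E}_{\geq 0}, \mathbb{O}; x)
&= \cfrac{1}{1 - x - \cfrac{x^2}{1 - x + \frac{x^2}{1-x^2} - \cfrac{\left(x + \frac{x^3}{1-x^2}\right)^2}{1 - x + \frac{x^2}{1-x^2} - \cfrac{x^2}{1 - x + \frac{x^2}{1-x^2} - \cdots}}}} \\
&= \frac{2(1 - x + x^3)}{1 - 2x + 2x^3 + \sqrt{1 - 4x + 4x^2 - 4x^4}} \\
&= 1 + x + 2x^2 + 4x^3 + 8x^4 + 17x^5 + 38x^6 + 88x^7 + 208x^8 + \cdots
\end{aligned}
\]

\[
\begin{aligned}
F^{\mathrm{p,v}}(\mathbb{O}, \mathbb{O}; x)
&= \cfrac{1}{1 - x + x^2 - \cfrac{x^2}{1 - x - \cfrac{x^2}{1 - x + 2x^2 - \cfrac{x^2}{1 - x - \cfrac{x^2}{1 - x + 2x^2 - \cdots}}}}} \\
&= \frac{2(1 - x)}{1 - 2x + x^2 + \sqrt{1 - 4x + 6x^2 - 8x^3 + 5x^4 - 4x^5 + 4x^6}} \\
&= 1 + x + x^2 + 2x^3 + 5x^4 + 12x^5 + 27x^6 + 60x^7 + 137x^8 + \cdots
\end{aligned}
\]

\[
\begin{aligned}
F^{\mathrm{p,v}}(\mathbb{E}_{\geq 0}, \mathbb{E}_{\geq 0}; x)
&= \cfrac{1}{1 - x - \cfrac{x^2}{1 - x + 2x^2 - \cfrac{x^2}{1 - x - \cfrac{x^2}{1 - x + 2x^2 - \cfrac{x^2}{1 - x - \cdots}}}}} \\
&= \frac{1 - 2x + 3x^2 - 2x^3 - \sqrt{1 - 4x + 6x^2 - 8x^3 + 5x^4 - 4x^5 + 4x^6}}{2x^2(1 - x)} \\
&= 1 + x + 2x^2 + 4x^3 + 7x^4 + 13x^5 + 27x^6 + 59x^7 + 131x^8 + \cdots
\end{aligned}
\]

We note that the list of coefficients of $F^{\mathrm{p,v}}(\mathbb{E}_{\geq 0}, \mathbb{O}; x)$ in particular is a shifted version of the OEIS sequence [15, A025276], which can be verified by comparing generating functions.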
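The four parity-restricted series can be reproduced by brute force for small $n$. The sketch below uses the convention stated before Theorem 16: the height of a peak or valley is the height at which the subword $UD$ or $DU$ begins. The predicate-based interface and the function names are assumptions for illustration.

```python
def motzkin_paths(n, height=0):
    """Yield Motzkin paths of length n (strings over U, D, F) that start at `height`."""
    if n == 0:
        if height == 0:
            yield ""
        return
    for letter, dh in (("U", 1), ("D", -1), ("F", 0)):
        h = height + dh
        if h < 0 or h > n - 1:
            continue
        for rest in motzkin_paths(n - 1, h):
            yield letter + rest

STEP = {"U": 1, "D": -1, "F": 0}

def count_restricted(n, peak_ok, valley_ok):
    """Count paths of length n whose peaks and valleys occur only at allowed heights."""
    total = 0
    for path in motzkin_paths(n):
        h, ok = 0, True
        for i in range(n - 1):
            pair = path[i : i + 2]
            # h is the height at which the two-letter subword begins
            if (pair == "UD" and not peak_ok(h)) or (pair == "DU" and not valley_ok(h)):
                ok = False
                break
            h += STEP[path[i]]
        total += ok
    return total

def even(h):
    return h % 2 == 0

def odd(h):
    return h % 2 == 1

def coefficients(peak_ok, valley_ok, up_to=8):
    return [count_restricted(n, peak_ok, valley_ok) for n in range(up_to + 1)]

print(coefficients(odd, even))    # peaks at odd heights, valleys at even heights
print(coefficients(even, odd))    # peaks at even heights, valleys at odd heights
print(coefficients(odd, odd))
print(coefficients(even, even))
```

Valleys always begin at a positive height, so the predicate `even` restricted to valleys corresponds to the positive even integers.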
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